Open vancelopez opened 5 years ago
@vancelopez Thanks for the request. I'm not very familiar with the Analytics External Data API. It looks like a pretty straightforward HTTP request of a CSV file with JSON header. Do you have an example of the type of CSV data you'd like to upload? I'm not sure what the type of file you want to upload would look like. Thanks.
Thanks, @StevenMMortimer for the quick reply. The CSVs I upload (via web interface) are basic datasets (i.e. iris
) that vary in size. The largest I upload is about 1 Gb. It looks like the External Data API wants the data in 10 mb chunks. Also, I believe Salesforce requires at least one "dimension" on external data, so there is a requirement for a character variable.
The metadata JSON could be created programmatically referencing Analytics External Data Format. However, it would also be nice to have the option to upload a user defined JSON file - I've done that with the large dataset because there are over 50 columns and so it becomes cumbersome to define datatypes using the point and click interface.
@vancelopez I did some testing and realized that you can already use the salesforcer package to push data to the Analytics External Data API because it involves just creating a few records as you would normally do with any other object in Salesforce. Below is an example for you to follow.
I'll continue working on creating a specific function in salesforcer that will reduce this routine to a single function that will also do the more exciting things like inferring the data structure's JSON and stuff like that. However, this should be enough for you to start uploading your own data via the API. Let me know if you have any issues.
# load libraries that are required to run this example
library(salesforcer) # needed to interact with the Salesforce APIs
library(nycflights13) # package with a large dataset to test with
library(dplyr) # needed for bind_rows()
library(readr) # needed to write the csv out
library(iotools) # needed to process and create the binary files in <10MB chunks
library(base64enc) # needed to encode those data parts to be sent across the API
# authenticate to Salesforce as you normally would
sf_auth(username = username,
password = password,
security_token = security_token,
login_url = 'https://login.salesforce.com/')
# insert a row into the InsightsExternalData
input_data <- c("Format"="Csv",
"EdgemartAlias"="flights",
#"MetadataJson"=metadataJson, # FUTURE ENHANCEMENT - CURRENTLY SUPPLY YOUR OWN JSON USING THE jsonlite PACKAGE and encode it using base64encode
"Operation"="Overwrite",
"Action"="None")
ied_header_record <- sf_create(input_data, object_name="InsightsExternalData")
# write the data as a CSV into a temp file
f <- tempfile()
write_csv(flights, f)
# process the CSV file into chunks of < 10MB each and convert them to base64 encoded strings
data_chunks <- chunk.apply(input = f,
CH.MERGE = list,
CH.MAX.SIZE = 7000000, #10485760 resulted in ~13mb chunks so use smaller
FUN = base64encode)
# for each chunk of the dataset insert it into the InsightsExternalDataPart object
# using the part number and the Id of the InsightsExternalData record we created initially
all_parts <- NULL
for(i in 1:length(data_chunks)){
input_data <- c("DataFile" = data_chunks[[i]][1],
"InsightsExternalDataId" = ied_header_record$id,
"PartNumber" = i)
this_part <- sf_create(input_data, object_name="InsightsExternalDataPart")
all_parts <- bind_rows(all_parts, this_part)
}
# stop if not all the parts were successfully uploaded
stopifnot(all(all_parts$success))
# now start the processing of the dataset by Salesforce
input_data <- c("Id"=ied_header_record$id, "Action"="Process")
ied_header_record_start <- sf_update(input_data, object_name="InsightsExternalData")
# keep checking back until the Status="Completed"
sf_retrieve(ied_header_record$id, fields=c("Status","StatusMessage"), object_name="InsightsExternalData")
Here is a screenshot of what the flights
dataset looks like inside the Analytics dashboard
Incredible. Thanks @StevenMMortimer. I will test.
@StevenMMortimer Successfully uploaded my own data to Analytics using your example. Thanks for the work on this!
I already had a metadata JSON text file, so included it in the header record with:
metadatajson <- base64encode::base64encode("C:/...")
I also tried first reading in the JSON using read_json()
, then converting back with base64encode(toJSON(jsontext))
but it returned an error, "Invalid data in metadata property". I have not worked much with jsonlite.
Salesforce.com documentation.
Would be great to be able to add Analytics datasets via API.
Particular use case: pull data from onsite SQL Server, transformations, push via salesforcer to Analytics.