jhilyard closed this issue 3 years ago
Thank you for filing this issue. We appreciate your feedback and will review the issue as soon as possible. Remember, however, that GitHub isn't a mechanism for receiving support under any agreement or SLA. If you require immediate assistance, contact Salesforce Customer Support.
Here's a possible one-line fix: add the bom option to the parser options. The parser is currently created with:
const { parse } = require('csv-parse'); // import added for context; csv-parse v5 exposes a named export (v4 exports the function directly)
let parser = parse({
  columns: true,
  skip_empty_lines: true
});
Add the line bom: true:
let parser = parse({
  bom: true, // strip a leading byte order mark before reading the header row
  columns: true,
  skip_empty_lines: true
});
as recommended in the csv-parse documentation.
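To illustrate the difference, here's a minimal sketch (assuming csv-parse v5's sync API, not the plugin's actual code) showing what the bom option changes:
const { parse } = require('csv-parse/sync');

// U+FEFF is the byte order mark that BOM-encoded files start with.
const csv = '\uFEFF' + 'Name\nBOM\n';

// Without bom: true, the BOM is folded into the first column name.
console.log(Object.keys(parse(csv, { columns: true })[0]));            // [ '\ufeffName' ]

// With bom: true, the BOM is stripped and the header parses cleanly.
console.log(Object.keys(parse(csv, { bom: true, columns: true })[0])); // [ 'Name' ]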
@jhilyard sorry for the delay in response, but thank you for digging into the code and suggesting a fix. We just open-sourced that plugin (https://github.com/salesforcecli/data/tree/main/packages/plugin-data), and I'll put up a PR there adding that option.
This issue has been linked to a new work item: W-8836961
Hi @jhilyard, check out the announcement for the new data plugin.
I think this is fixed in the new data plugin released 4/1. If not, re-open this issue or create a new one.
@WillieRuemmele @mshanemc and other contributors please accept my belated thanks for the UTF-8 BOM CSV fix! My attention was elsewhere when the data plugin was released; but I've been using it with no problems and I appreciate your efforts.
Summary
sfdx force:data:bulk:insert fails with a UTF-8 with BOM encoded CSV: it treats the byte order mark as part of the first field name.
Steps To Reproduce:
Repository to reproduce: sfdx_upsert_utf8bom
More detail on the repro steps is provided in the readme.md.
NOTE: If your issue is not reproducible with dreamhouse-lwc, i.e. requires specific metadata or files, we require a link to a simple Salesforce project repository with a script to set up a scratch org that reproduces your problem.
sfdx force:org:create -a MyScratchOrg -f config/project-scratch-def.json -s
sfdx force:source:push
sfdx force:data:bulk:upsert -s Account -f ./Account_UTF8_NO_BOM.csv -i Id -w 2
sfdx force:data:bulk:upsert -s Account -f ./Account_UTF8_BOM.csv -i Id -w 2
sfdx force:data:bulk:status -i <jobId> -b <batchId>
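For reference, here's a sketch that would produce the two CSVs used above (an assumption: the contents are inferred from the expected results below, not copied from the linked repo):
const fs = require('fs');

// Plain UTF-8, no byte order mark.
fs.writeFileSync('./Account_UTF8_NO_BOM.csv', 'Name\nNo_BOM\n');

// The same shape of data, prefixed with U+FEFF so the file is UTF-8 with BOM.
fs.writeFileSync('./Account_UTF8_BOM.csv', '\uFEFF' + 'Name\nBOM\n');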
Expected result
Accounts with Name values BOM and No_BOM are present.
Actual result
Account with Name No_BOM is present. The upsert using the UTF-8 BOM file fails because the byte order mark is treated as part of the first column name in the CSV. The command exits to the prompt before the timeout; checking the job batch status shows:
InvalidBatch : Field name not found : Name
Pasting the message into VS Code shows a non-printable character symbol at the beginning of the field name Name.
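For anyone hitting the same failure, a small diagnostic sketch (my own, not part of sfdx) that confirms the invisible character is the BOM:
const fs = require('fs');

// Node's 'utf8' decoding keeps the BOM, so a BOM-encoded file starts with U+FEFF.
const text = fs.readFileSync('./Account_UTF8_BOM.csv', 'utf8');
console.log(text.charCodeAt(0).toString(16)); // prints 'feff' for the BOM file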
Additional information
The documentation only refers, without providing a link, to Preparing a CSV, which says only that files must be in UTF-8 format. In the meantime, a warning about the UTF-8 BOM incompatibility on that documentation page would be greatly appreciated.
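As a stopgap, stripping the BOM before calling sfdx works. A sketch (the input file name is from the repro above; the output name is my assumption):
const fs = require('fs');

const buf = fs.readFileSync('./Account_UTF8_BOM.csv');
// A UTF-8 BOM is the three bytes EF BB BF at the start of the file.
const hasBom = buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf;
fs.writeFileSync('./Account_stripped.csv', hasBom ? buf.subarray(3) : buf);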
Please note: Tableau Prep Builder (at least the latest version, 2020.3.1) creates output CSV files with UTF-8 BOM encoding only. There is no indication of the encoding; the only choice offered is between the "hyper" format and "CSV". Since Tableau Prep Builder can pull data from Salesforce (to get lookup Ids), databases, and text files, and is in the Salesforce ecosystem, it seemed like a good plan to pair it with sfdx force:data:bulk:insert for scripting incremental loads: Tableau Prep Builder cannot output to Salesforce itself, but it is much more user-friendly than scripting Data Loader. That's what got me into this mess.
SFDX CLI Version (to find the version of the CLI engine, run sfdx --version):
sfdx-cli/7.78.1-5a65d9dd2f win32-x64 node-v12.18.3
SFDX plugin Version (to find the version of the CLI plugin, run sfdx plugins --core):
OS and version:
Windows 10 Version 10.0.18363 Build 18363