Open jelm-vw opened 3 years ago
Same issue here.
@jelm-vw can you paste your solution or make a fork?
This is effectively the (temporary) monkey-patch I use:
# patch.py
from functools import wraps


def _encode_job_data(prepare_data):
    @wraps(prepare_data)
    def wrapper(*args, **kwargs):
        original: str = prepare_data(*args, **kwargs)
        encoded: bytes = original.encode('utf-8')
        return encoded
    return wrapper


def patch_salesforce_api(salesforce_api):
    salesforce_api.services.bulk.v2.Job._prepare_data = _encode_job_data(salesforce_api.services.bulk.v2.Job._prepare_data)

# some other module
import salesforce_api
import patch

patch.patch_salesforce_api(salesforce_api)
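To see what the wrapper does without installing salesforce_api, here is a minimal sketch that applies the same decorator to a stand-in class. `FakeJob` and its `_prepare_data` body are hypothetical and only imitate the shape of the real `Job` class; the real method's behaviour is not reproduced here.

```python
from functools import wraps


def _encode_job_data(prepare_data):
    """Wrap a method that returns str so that it returns UTF-8 bytes instead."""
    @wraps(prepare_data)
    def wrapper(*args, **kwargs):
        original = prepare_data(*args, **kwargs)
        return original.encode('utf-8')
    return wrapper


class FakeJob:
    """Hypothetical stand-in for salesforce_api.services.bulk.v2.Job."""

    def _prepare_data(self, rows):
        # Join the rows into a CSV-like string (illustrative only).
        return "\n".join(",".join(row) for row in rows)


# Replace the method on the class, the same way the patch above does.
FakeJob._prepare_data = _encode_job_data(FakeJob._prepare_data)

payload = FakeJob()._prepare_data([["Name"], ["Łódź Logistics"]])
print(type(payload).__name__)
```

Because the method is replaced on the class rather than on an instance, every job created after the patch runs picks up the encoded output.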
@jelm-vw It works! Thanks a million!
Nice find! And nice workaround! I will create a PR for this, this weekend, and make sure to attempt to detect the data encoding before encoding it!
How can I use your monkey-patch in my code?
from salesforce_api import Salesforce
client = Salesforce(...)
...
client.bulk.upsert('Account', accounts)
...
Salesforce requires the uploaded data to be encoded as (or at least compatible with) UTF-8 (https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/datafiles_prepare_csv.htm, fourth bullet point from the top). In practice, though, upload jobs containing characters outside ISO-8859-1 fail in Python before the ingest request can be sent to Salesforce.
The `bulk` client does not encode the CSV data, which remains as type `str` until a lower-level package must make an encoding decision. The low-level Python `http` library sees a `str` object and tries to make a `bytes` out of it by encoding to the HTTP-default, ISO-8859-1. But I pass it data that is not compatible with that encoding, so it raises a `UnicodeEncodeError`.

Here is a contrived example of something that should work but doesn't:
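The failure can be sketched with the standard library alone, without salesforce_api or a network call. The CSV string and the `try_encode` helper are made up for illustration; the only thing carried over from the description above is that a `str` body gets encoded as ISO-8859-1.

```python
# Hypothetical CSV payload: "Ł" and "ź" have no ISO-8859-1 representation.
csv_data = "Name\nŁódź Logistics\n"


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the text cannot be represented."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# What happens when a str body is encoded with the HTTP default.
latin1_result = try_encode(csv_data, "iso-8859-1")

# What Salesforce expects to receive.
utf8_result = try_encode(csv_data, "utf-8")

print("ISO-8859-1 encodable:", latin1_result is not None)
print("UTF-8 encodable:", utf8_result is not None)
```

Handing the HTTP layer `bytes` that are already UTF-8 encoded sidesteps the implicit ISO-8859-1 step entirely.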
As a workaround, in the codebase I'm working in, I've monkey-patched `salesforce_api.services.bulk.v2.Job._prepare_data` such that it calls `encode('utf-8')` and returns `bytes`. I've not submitted a PR to change this function, as there's a stack of calling functions that all expect `str`, so encoding then and there may not be the desired long-term fix. But the patch works for now.
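One possible way to ease the `str` versus `bytes` concern is to encode as late as possible with a helper that accepts either type. This is only a sketch: `ensure_utf8_bytes` is a hypothetical name, not part of salesforce_api, and where such a call would belong in the library is left open.

```python
def ensure_utf8_bytes(data):
    """Return UTF-8 bytes for str input; pass bytes through unchanged."""
    if isinstance(data, bytes):
        # Already encoded; assume the caller chose a Salesforce-compatible encoding.
        return data
    if isinstance(data, str):
        return data.encode("utf-8")
    raise TypeError(f"expected str or bytes, got {type(data).__name__}")


body = ensure_utf8_bytes("Name\nŁódź Logistics\n")
already_encoded = ensure_utf8_bytes(body)
print(body == already_encoded)
```

Callers further up the stack could then keep working with `str`, since the conversion is idempotent and only happens right before the request is sent.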