singer-io / tap-marketo

Modify stream_rows to better support encodings #51

Closed. KAllan357 closed this 5 years ago

KAllan357 commented 5 years ago

decode_unicode=True was causing errors, so we now open the tempfile in binary mode, write the raw bytes to it, and then reopen the file for the csv reader with utf-8 encoding.
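A minimal sketch of the approach described above, not the actual tap-marketo code: the function name rows_from_byte_chunks and its parameters are hypothetical, and the input is assumed to be any iterable of raw byte chunks (for example, the output of iter_content without decode_unicode).

```python
import csv
import os
import tempfile


def rows_from_byte_chunks(chunks, encoding="utf-8"):
    """Yield CSV rows as dicts from an iterable of raw byte chunks.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        # Write the undecoded bytes first. A multi-byte character split
        # across two chunks is reassembled on disk before any decoding.
        with os.fdopen(fd, "wb") as raw:
            for chunk in chunks:
                raw.write(chunk)
        # Reopen in text mode so decoding happens once, over the whole file.
        # newline="" lets the csv module handle line endings itself.
        with open(path, "r", encoding=encoding, newline="") as text:
            yield from csv.DictReader(text)
    finally:
        os.remove(path)
```

Because the bytes are decoded only after the full file is written, chunk boundaries that fall in the middle of a character should no longer matter.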

erameshbabu commented 3 years ago

A suggested fix is outlined here:

https://github.com/airbytehq/airbyte/issues/4405#issuecomment-870936481

erameshbabu commented 3 years ago

> decode_unicode=True was causing errors so we open the tempfile in binary mode, write the raw bytes and then open the file for the csv reader using utf-8.

@KAllan357 Could you please check whether the suggestion in https://github.com/airbytehq/airbyte/issues/4405#issuecomment-870936481 fixes the issue? It resolved the problem in our local environment, where we had trouble reading Chinese characters from Marketo. Thanks.

Fix: explicitly set the response encoding to 'utf-8' before iterating through the data.


resp = client.stream_export(stream_type, export_id)
resp.encoding = 'utf-8'
for chunk in resp.iter_content(chunk_size=CHUNK_SIZE_BYTES, decode_unicode=True):
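To illustrate why the encoding matters when decoding chunk by chunk, here is a standard-library sketch. It assumes, as I understand it, that requests decodes streamed chunks with an incremental decoder built from resp.encoding; the helper name decode_chunks is hypothetical and this is not the library's code.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode an iterable of byte chunks into text pieces, incrementally.

    Hypothetical helper for illustration only.
    """
    # An incremental decoder buffers a trailing partial character until
    # the next chunk supplies the remaining bytes.
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush anything still buffered at the end of the stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

With the right encoding, characters split across chunk boundaries come out intact. With the wrong one (for instance a single-byte charset such as latin-1), every byte is decoded on its own and multi-byte text such as Chinese turns into mojibake, which would be consistent with the symptom reported above.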