jacopotagliabue / you-dont-need-a-bigger-boat

An end-to-end implementation of intent prediction with Metaflow and other cool tools
MIT License

Fix missing ESCAPED_DQ File Format #7

Closed bigluck closed 2 years ago

bigluck commented 2 years ago

The definition of the custom ESCAPED_DQ file format is missing from the sf_connector file, which causes issues when a user tries to upload the data into Snowflake.

Original issue: https://github.com/jacopotagliabue/you-dont-need-a-bigger-boat/issues/6
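
For readers unfamiliar with named file formats, here is a minimal sketch of what defining one before the upload could look like. The format options shown are illustrative assumptions, not the definition added by this PR, and the helper names `create_file_format_sql` and `ensure_file_format` are hypothetical. The cursor is assumed to be any object with an `execute()` method, such as a Snowflake connector cursor.

```python
def create_file_format_sql(name: str = "ESCAPED_DQ") -> str:
    """Build a CREATE FILE FORMAT statement.

    The options below are assumptions for illustration only; the real
    ESCAPED_DQ definition lives in the repository's connector code.
    """
    return (
        f"CREATE FILE FORMAT IF NOT EXISTS {name} "
        "TYPE = CSV "
        "FIELD_DELIMITER = ',' "
        "FIELD_OPTIONALLY_ENCLOSED_BY = '\"'"
    )


def ensure_file_format(cursor, name: str = "ESCAPED_DQ") -> None:
    """Create the named format if it is missing, using a cursor with execute()."""
    cursor.execute(create_file_format_sql(name))
```

Using `IF NOT EXISTS` keeps the call safe to repeat, so it could run at the start of every upload without failing when the format is already defined.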

mihail911 commented 2 years ago

According to my testing, this works. An alternative that also worked for me (and successfully ingested all the data) without defining a custom format is the following snippet:

self._cs.execute(f"PUT file://{absolute_file_path} @%{table}")
self._cs.execute(f"COPY INTO {table} FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY ='\"')")

This doesn't cover as many of the characters a field can be enclosed by as the custom format does, though I'm not sure whether that's an issue.
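
As a rough, self-contained sketch of the inline-format alternative above, the two statements could be built separately from their execution so they can be inspected or logged before running. The helper names `build_upload_statements` and `upload_csv` are hypothetical, and the cursor is assumed to be any object exposing `execute()`, standing in for `self._cs`.

```python
from pathlib import Path
from typing import List


def build_upload_statements(file_path: str, table: str) -> List[str]:
    """Return the PUT and COPY INTO statements for one CSV file.

    Mirrors the two statements in the snippet above: the file is staged in
    the table stage (@%table) and copied with an inline CSV file format.
    """
    # PUT needs an absolute path; as_posix() keeps forward slashes.
    absolute_file_path = Path(file_path).resolve().as_posix()
    put_sql = f"PUT file://{absolute_file_path} @%{table}"
    copy_sql = (
        f"COPY INTO {table} "
        "FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"')"
    )
    return [put_sql, copy_sql]


def upload_csv(cursor, file_path: str, table: str) -> None:
    """Stage the file, then load it, using a cursor with execute()."""
    for statement in build_upload_statements(file_path, table):
        cursor.execute(statement)
```

Because the format is declared inline, this path does not depend on any named file format existing in the schema. The table name is interpolated directly into the SQL, so it should only ever come from trusted code, not from user input.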

jacopotagliabue commented 2 years ago

All good, thx Luca!