civisanalytics / civis-python

Civis API Python Client
BSD 3-Clause "New" or "Revised" License
34 stars 26 forks source link

Ensure DataFrame import successfully occurs #256

Closed ffaristocrat closed 6 years ago

ffaristocrat commented 6 years ago

This is related to this ongoing JIRA

https://civisanalytics.atlassian.net/browse/CIVP-12730

Platform import simply can't handle files that have a /" anywhere in the file.

If a double quote is escaped with a backslash, the import fails. If a file has fields enclosed by double quotes and any field ends with a /, the import fails.

This PR does two things to ensure a DataFrame will be succesfully imported via API:

  1. Cleans dataframe fields by inserting a space in between a backslash and double quotesand inserts a space at the end of any string that ends with a backslash.
  2. Exports the DataFrame to CSV with | as the delimiter, doublequotes doubled up to escape, \ as the escapechar and all fields enclosed in double quotes and imports it into platform with the same parameters.

There may be a broader goal of preparing any file (including already existing CSVs) in this fashion to ensure successful import to Platform but for now, this only addresses the dataframe to Civis situation.

I've encountered this bug while writing the Classy & MobileCommons integrations and wrote a modified version of dataframe_to_civis to handle the /" problems. This PR brings over the changes that have been succesfully importing data from those system.

I've included tests for the clean_text function.

ffaristocrat commented 6 years ago

Maybe there should be a warning if the data has problems?

The rest of the upload options (|, double quote, quote all) should be standard though as they cause the least issues uploading into platform.

stephen-hoover commented 6 years ago

Could you separate the best-practice CSV creation options into a different PR while we discuss the best approach on this one? Don't forget to update the changelog.

ffaristocrat commented 6 years ago

Will do

keithing commented 6 years ago

Closing this per @stephen-hoover's objections.