jf-tech / omniparser

omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
MIT License

Question: error when processing comma-separated CSV #153

Closed wujunyi792 closed 3 years ago

wujunyi792 commented 3 years ago

When a cell contains a "," it is split into multiple pieces, even though the whole cell is wrapped in quotation marks.

The CSV data: image

The parsing result: image

Is it my fault?

jf-tech commented 3 years ago

@wujunyi792

1) Can you include your schema (simplified if you have trade secret concerns) and a full CSV line that triggers the error? You can replace personal information with generic data if you want.

2) I cannot read the data very clearly because you only included screenshots, but I noticed a ' (single quote) in the value of the KaoShengTeChang field and another ' in TouDangChengJi. In your schema's file_declaration section, did you use "replace_double_quotes": true? This setting is only meant for CSV input that you know contains improperly escaped double quotes. Judging from your CSV line snippet, the double quote usage appears to be legitimate. Can you try removing replace_double_quotes and re-running? For more details about CSV-specific settings, see https://github.com/jf-tech/omniparser/blob/master/doc/csv_in_depth.md#csv-file_declaration
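To illustrate why that setting could produce the split, here is a minimal sketch that uses Go's standard encoding/csv package rather than omniparser itself. It assumes that replace_double_quotes rewrites every " as ' before the line is parsed, which would explain the single quotes in the output. The sample row and the parseLine helper are made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseLine is a hypothetical helper: it parses a single CSV line with the
// standard library reader (not omniparser) and returns its fields.
func parseLine(line string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	return r.Read()
}

func main() {
	// Made-up sample row; the middle cell holds a comma inside double quotes.
	line := `1,"swimming, chess",90`

	// Properly quoted input: the comma stays inside one field.
	fields, err := parseLine(line)
	fmt.Println(len(fields), fields, err)

	// Assumed effect of replace_double_quotes: every " becomes '.
	// A single quote has no special meaning in CSV, so the comma
	// inside the cell is now treated as a field delimiter.
	replaced := strings.ReplaceAll(line, `"`, `'`)
	fields, err = parseLine(replaced)
	fmt.Println(len(fields), fields, err)
}
```

With the double quotes intact the row parses into three fields. After the replacement it parses into four, and the pieces of the broken cell keep a stray single quote, which matches the symptom in the screenshots.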

jf-tech commented 3 years ago

Closed due to lack of activity.