Open lhoestq opened 1 year ago
Thanks for pointing this out, @lhoestq.
On the one hand, should we support column names with quotes?
On the other hand, this specific dataset just contains a CSV without a header row, so the column name is actually text content.
I agree that we could at least catch this error, as we already do in /filter, and raise a specific one.
No need to spend time on this imo, I mostly created this issue to save time the next time we see a similar one. But yeah, a nice error message would be better.
I think we should actually disallow column names with any weird characters. I'd do something similar to the validation in the /filter feature (disallowing ";", "--", r"/\*", r"\*/"). I wanted to work on this.
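For illustration, here is a minimal sketch of what such a column-name check could look like. It only assumes the four tokens quoted above (";", "--", r"/\*", r"\*/"); the names `FORBIDDEN_PATTERNS`, `InvalidColumnNameError`, `find_forbidden_token` and `validate_column_names` are hypothetical and not taken from the actual /filter code.

```python
import re
from typing import Iterable, Optional

# Assumption: the same four tokens mentioned for the /filter validation.
# Each entry is a regex fragment; "/\*" and "\*/" match "/*" and "*/".
FORBIDDEN_PATTERNS = [";", "--", r"/\*", r"\*/"]
FORBIDDEN_RE = re.compile("|".join(FORBIDDEN_PATTERNS))


class InvalidColumnNameError(ValueError):
    """Hypothetical error type for column names containing forbidden tokens."""


def find_forbidden_token(name: str) -> Optional[str]:
    """Return the first forbidden token found in `name`, or None if it is clean."""
    match = FORBIDDEN_RE.search(name)
    return match.group(0) if match else None


def validate_column_names(names: Iterable[str]) -> None:
    """Raise InvalidColumnNameError on the first column name with a forbidden token."""
    for name in names:
        token = find_forbidden_token(name)
        if token is not None:
            raise InvalidColumnNameError(
                f"Column name {name!r} contains the forbidden token {token!r}"
            )
```

A job could call `validate_column_names(...)` on the dataset's column names before building any SQL and turn the exception into a specific, user-facing error. Note that this sketch does not reject quotes on its own; whether to add them to the list is the open question in this thread.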
Not super important but affects the split-descriptive-statistics and the split-duckdb-index jobs.
e.g. this dataset has a pretty long column name that has quotes and it raises this error