Textualize / rich-cli

Rich-cli is a command line toolbox for fancy output in the terminal
https://www.textualize.io
MIT License

"Could not determine delimiter" when trying to render TSV via stdin #54

Open hjacobs opened 2 years ago

hjacobs commented 2 years ago

Rendering TSV (tab-separated values) works when passing a file name:

rich temp.tsv

But it fails for the same file when the data is passed via stdin (-), with the error "Could not determine delimiter":

cat temp.tsv | rich - --csv

Apparently the CSV/TSV sniffer does not work correctly here. When a file name is passed, detection via the file extension (.tsv) selects the excel-tab dialect of the csv parser, so rendering works. When the same data arrives via stdin (-), there is no extension to go on, the sniffer has to guess the delimiter, and it fails.
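To illustrate the difference between the two code paths, here is a minimal sketch of extension-based dialect selection with a sniffer fallback. It is not rich-cli's actual code: `DIALECT_BY_SUFFIX`, `dialect_for` and `read_rows` are hypothetical names, and only the `excel` and `excel-tab` dialects come from Python's standard csv module.

```python
import csv
import io
from pathlib import Path

# Hypothetical mapping for illustration; rich-cli's real table may differ.
DIALECT_BY_SUFFIX = {".csv": "excel", ".tsv": "excel-tab"}


def dialect_for(name):
    """Return a csv dialect name for a file name, or None if the suffix is unknown."""
    return DIALECT_BY_SUFFIX.get(Path(name).suffix.lower())


def read_rows(text, name):
    """Parse `text`, preferring the dialect implied by `name` over sniffing."""
    dialect = dialect_for(name)
    if dialect is None:
        # No usable extension (for example "-" for stdin): guess from the content.
        # csv.Sniffer().sniff raises csv.Error if it cannot decide.
        dialect = csv.Sniffer().sniff(text)
    return list(csv.reader(io.StringIO(text), dialect))
```

With a name like temp.tsv the sniffer is never consulted, which would explain why the file-name case works while the stdin case depends entirely on the sniffer.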

hjacobs commented 2 years ago

OK, apparently the problem is that only truncated data ([:1024]) is sniffed. This can break the CSV sniffer algorithm, because it tries to detect the delimiter by counting its occurrences on each line, so truncating in the middle of a line corrupts the data the sniffer sees.

Changing the logic to sniff the first N lines instead of the first 1024 characters would solve this issue.
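A minimal sketch of that idea, assuming the whole input is already available as a string. The function name `sniff_first_lines`, the default of 20 lines, and the candidate delimiter set are my own assumptions, not anything taken from rich-cli:

```python
import csv
import io
from itertools import islice


def sniff_first_lines(text, max_lines=20, delimiters=",\t;|"):
    """Sniff a csv dialect from the first `max_lines` complete lines of `text`."""
    # Iterating a StringIO yields whole lines, so the sample is never cut mid-line
    # (unlike a fixed-size slice such as text[:1024]).
    sample = "".join(islice(io.StringIO(text), max_lines))
    # Restricting the candidate delimiters is optional; sniff() raises csv.Error
    # ("Could not determine delimiter") if no candidate is consistent enough.
    return csv.Sniffer().sniff(sample, delimiters=delimiters)
```

The returned dialect can then be passed to csv.reader as usual. For very long lines it may still make sense to cap the sample size, but the cap should fall on a line boundary.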

patatetom commented 2 years ago

Hi, and/or add a --delim option (or something similar) on the command line to force the delimiter, for example when detection fails. Regards.
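Roughly, such an override could look like the sketch below. The `--delim` flag does not exist in rich-cli as far as this thread shows; `parse_rows` and its `delimiter` parameter are hypothetical stand-ins for whatever value the option would carry.

```python
import csv
import io


def parse_rows(text, delimiter=None):
    """Parse delimited text; an explicit `delimiter` bypasses sniffing entirely."""
    if delimiter is not None:
        # Value of the hypothetical --delim option: trust the user, skip detection.
        return list(csv.reader(io.StringIO(text), delimiter=delimiter))
    # No override given: fall back to detection, which may raise csv.Error.
    dialect = csv.Sniffer().sniff(text)
    return list(csv.reader(io.StringIO(text), dialect))
```

Calling `parse_rows(data, delimiter="\t")` would then work for TSV input even where the sniffer gives up.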

harkabeeparolus commented 1 year ago

> OK, apparently the problem is with only sniffing truncated data ([:1024]) which can break the CSV sniffer algorithm as it tries to detect the delimiter by counting the occurrences on each line (and truncating in the middle of a line will therefore corrupt the data for the sniffer).

Wow, is that the reason why?!? 😲 I've been wondering for years why the code example in the official Python csv.Sniffer docs does not seem to work. I never realized it is because it breaks in the middle of a line. 🤨

Seems to me this should be fixed in the official Python docs as well, since I've never managed to get it to work...

Anyway, thanks for this gem! 😊

luckman212 commented 1 year ago

Is there a --delim or similar option to force a specific delimiter?

YUKI2eN3e commented 1 year ago

I just submitted a pull request to add --csv-format that lets you set the dialect to use.