BurntSushi / xsv

A fast CSV command line toolkit written in Rust.
The Unlicense

`xsv table` has unexpected output with long input (too-wide tabs, missing tabs) #345

Open DefaultGen opened 3 months ago

DefaultGen commented 3 months ago

Hi, I'm trying to use xsv 0.13.0 as a replacement for the Unix `column` command in a script because it is much faster. However, I get lines with strange spacing even with what I think is simple, although long, input.

I have a simple two column .csv like this:

title,ext
exqz3k,.md
fov1sy,.md
g421ji,.md
...

If I run `cat simple_example.csv | xsv table > outputFile`, I see the following weird lines (there are a few more examples of both issues throughout the output):

83pddb  .md
z4veyi.md           (Line 14895, no space between title and ext)
3bl6ub  .md
...
qvxr41  .md
scw1ye        .md   (Line 17874, large tab between title and ext)
eti8p6  .md

With more complex data, the output errors happen much more frequently (in attached complex_example.csv, some errors start around line 1113 for example).
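
To locate every affected line rather than spotting them by eye, a small checker along these lines might help. This is only a sketch: it assumes a two-column table whose fields contain no whitespace (true for the simple example above, not necessarily for complex_example.csv), and the function names are made up for illustration. It treats the most common start offset of the second column as the correct one and reports every line that deviates from it.

```python
from collections import Counter


def second_column_offset(line):
    """Return the offset where the second whitespace-separated field starts.

    Returns None when the line has fewer than two fields, which is what a
    run-together line such as "z4veyi.md" looks like.
    """
    stripped = line.rstrip()
    parts = stripped.split(None, 1)
    if len(parts) < 2:
        return None
    # parts[1] is the tail of the line, so its length gives the start offset.
    return len(stripped) - len(parts[1])


def find_misaligned(lines):
    """Return (line_number, text) pairs whose layout differs from the majority."""
    rows = [
        (number, line.rstrip(), second_column_offset(line))
        for number, line in enumerate(lines, 1)
        if line.strip()  # ignore blank lines
    ]
    counts = Counter(offset for _, _, offset in rows if offset is not None)
    if not counts:
        return []
    expected = counts.most_common(1)[0][0]
    return [(number, text) for number, text, offset in rows if offset != expected]


# Lines copied from the report above (annotations removed).
sample = [
    "83pddb  .md\n",
    "z4veyi.md\n",
    "3bl6ub  .md\n",
    "qvxr41  .md\n",
    "scw1ye        .md\n",
    "eti8p6  .md\n",
]
flagged = find_misaligned(sample)
for number, text in flagged:
    print(number, repr(text))
```

On a real run, the same function could be fed the lines of outputFile, e.g. `find_misaligned(open("outputFile"))`.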

I can reproduce this with any sufficiently large data set. I've attached the two randomly generated data sets. I understand xsv table is probably just meant to print a few human readable lines to the terminal, so if this kind of processing isn't in scope for the project I'll stick to the venerable column :-)

I'm on Arch Linux with xsv installed from pacman. I reproduced this on Debian as well.

simple_example.csv complex_example.csv

DefaultGen commented 3 months ago

As an even simpler test, I made a csv with a header, then foo,bar repeated 20,000 times:

1,2
foo,bar
foo,bar
foo,bar

The resulting xsv table output has numerous lines with no separator at all, which read foobar. Interestingly, if I remove the 1,2 header, I don't see the issue even if I repeat the row 10M times.
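
For anyone who wants to regenerate this minimal test case, a sketch like the following should produce an equivalent file. The function names and the file name repro.csv are hypothetical, chosen only for illustration. The header is optional so that the with-header and without-header variants can be compared.

```python
from pathlib import Path


def make_repro_csv(rows=20_000, header="1,2", row="foo,bar"):
    """Build the CSV text: an optional header line followed by `rows` copies of `row`."""
    lines = [header] if header is not None else []
    lines.extend([row] * rows)
    return "\n".join(lines) + "\n"


def write_repro(path, **kwargs):
    """Write the generated CSV to `path` and return the number of lines written."""
    text = make_repro_csv(**kwargs)
    Path(path).write_text(text)
    return text.count("\n")


# Small preview of what the generated file looks like.
print(make_repro_csv(rows=3), end="")
```

Calling `write_repro("repro.csv")` and then running `cat repro.csv | xsv table > outputFile` would mirror the steps described above. Passing `header=None` gives the variant without the 1,2 header.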

lespea commented 3 months ago

You'll probably have better luck with https://github.com/jqnatividad/qsv/ -- this hasn't been updated in over 6 years.

DefaultGen commented 3 months ago

> You'll probably have better luck with https://github.com/jqnatividad/qsv/ -- this hasn't been updated in over 6 years.

Thanks a lot, I hadn't found that! The qsv table command works perfectly for me.