wireservice / csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.
https://csvkit.readthedocs.io
MIT License

csvstack is column agnostic and corrupts output #1233

Closed (ulfh closed 3 months ago)

ulfh commented 3 months ago

csvstack does not check or use column names when stacking CSV files. This means that reordered columns, missing columns, or similar changes will corrupt the produced CSV output with no warning. In essence, everything that makes a tool valuable to use is nullified by this issue.

For instance, merging these two files happily produces the result below. Clearly at least a warning should be issued.

First file:

a,b
1,2

Second file:

b,a
1,2

Result:

a,b
1,2
1,2
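
For comparison, here is a minimal sketch of stacking by column name instead of by position, using only Python's standard `csv` module. It is not csvkit's implementation; the function name `stack_by_name` and the `missing` parameter are hypothetical, chosen for illustration.

```python
import csv
import io


def stack_by_name(sources, missing=""):
    """Stack CSV inputs by column name rather than by position.

    `sources` is a list of open text files (or file-like objects).
    Columns absent from an input are filled with `missing`.
    """
    readers = [csv.DictReader(src) for src in sources]

    # Build the union of all headers, keeping first-seen order.
    fieldnames = []
    for reader in readers:
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, restval=missing, lineterminator="\n"
    )
    writer.writeheader()
    for reader in readers:
        for row in reader:
            # Each row is a dict keyed by header, so order no longer matters.
            writer.writerow(row)
    return out.getvalue()


# The two files from the example above.
first = io.StringIO("a,b\n1,2\n")
second = io.StringIO("b,a\n1,2\n")
print(stack_by_name([first, second]))
```

With the two example files, the second data row comes out as `2,1` rather than `1,2`, because the values are matched to `a` and `b` by name.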

csvstack also (unsurprisingly) does not check the number of columns. This results in corrupted CSV files if the number of columns varies between the input files.
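
One way to detect this kind of corruption after the fact is to scan a stacked file for rows whose width differs from the header. The sketch below uses only the standard `csv` module; `find_ragged_rows` is a hypothetical helper, not part of csvkit.

```python
import csv
import io


def find_ragged_rows(source):
    """Return (line_number, field_count) for each row whose number of
    fields differs from the header row."""
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:
        # Empty input: nothing to check.
        return []
    expected = len(header)
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != expected
    ]


# A stacked file in which the second input had an extra column.
stacked = io.StringIO("a,b\n1,2\n1,2,3\n")
for line_num, count in find_ragged_rows(stacked):
    print(f"line {line_num}: expected 2 fields, found {count}")
```

An empty result means every row matches the header width; it says nothing about whether the columns are in the right order.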

$ csvstack --version
csvstack 1.0.6

jpmckinney commented 3 months ago

See #245. We can issue a warning.

ulfh commented 3 months ago

This appears to have been resolved as of version 1.1.1. Closing.