gsauthof closed this 9 months ago
It already autodetects the compression as long as the filenames end with .gz, .bz2, or .xz.
Ok, cool. Looks like I primarily tried csvstack on zstandard-compressed files.
So how about adding zstandard support then (based on the .zst extension)?
You would have to modify this method and then submit a pull request.
The Python standard library doesn't have support for zstandard, but I can add https://pypi.org/project/zstandard/ as an optional dependency.
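For reference, a rough sketch of what extension-based detection with an optional zstandard dependency could look like. The function name open_maybe_compressed is made up for illustration and is not csvkit's actual method. The sketch also assumes the zstandard package from PyPI is only imported when a .zst file is actually opened, so it stays optional.

```python
import bz2
import gzip
import lzma
import os


def open_maybe_compressed(path, encoding="utf-8"):
    """Open a CSV file as text, picking a decompressor from the extension.

    Hypothetical helper, not part of csvkit.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext == ".gz":
        return gzip.open(path, "rt", encoding=encoding, newline="")
    if ext == ".bz2":
        return bz2.open(path, "rt", encoding=encoding, newline="")
    if ext == ".xz":
        return lzma.open(path, "rt", encoding=encoding, newline="")
    if ext == ".zst":
        # Optional dependency: import lazily so users without
        # zstandard installed can still read every other format.
        try:
            import zstandard
        except ImportError as exc:
            raise RuntimeError(
                "reading .zst files requires the 'zstandard' package"
            ) from exc
        return zstandard.open(path, "rt", encoding=encoding, newline="")
    # No known compression extension: treat as a plain text file.
    return open(path, "r", encoding=encoding, newline="")
```

The returned object is a regular text stream in every branch, so it can be passed to csv.reader the same way as an uncompressed file.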
Say you have a directory of gzip or zstandard compressed csv files you want to merge.
For this it would be great if csvstack would auto-detect the compression, i.e. stream the files into a decompressor and process the files as usual.
This would also make sense for other tools, as many csv files compress very well and other tools support compression transparently (e.g. duckdb), meaning such support would increase interoperability when exchanging such files.
Example usage: