Open reagle opened 8 months ago
To answer your first question, I went through and looked in the source for def open_*() and def save_*().
Here's a list of every extension/filetype that visidata can read+write:
arrow gsheets npy tsv xd
arrows html org txt xls
csv jrnl parquet usv xlsx
dta jsonl png vdj xml
fixed jsonla rec vds zip
geojson lsv sqlite vdx
And here's what it can read, but not write:
airtable forg mh pdf toml
babyl frictionless mmdf puz ttf
bytes gdrive mnu pyprof vcf
conll git npz reddit vd
conllu h5 ods sas7bdat xlsb
eml jsonobj orgdir scrape xpt
f5log maildir pandas shp yml
fdir mbox pbf spss zulip
fec mbtiles pcap tar
And there seem to be a few it can write but not read: dot svg. And there are several extensions that are not exactly full-fledged file types; they are types of tables in the tabulate library, for table files (see loaders/texttable.py). The list of these includes jira md table (and more) that it can write, but not read.
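For anyone who wants to regenerate these lists, here is a rough sketch of the kind of scan described above. It only pattern-matches function names in source text, so it is an approximation; the sample signatures are made up for illustration and the helper names (scan_source, classify, scan_tree) are hypothetical, not part of VisiData.

```python
import re
from pathlib import Path

# Match "def open_<name>(" and "def save_<name>(" at the start of a line.
OPEN_RE = re.compile(r"^\s*def\s+open_(\w+)\s*\(", re.MULTILINE)
SAVE_RE = re.compile(r"^\s*def\s+save_(\w+)\s*\(", re.MULTILINE)


def scan_source(text):
    """Return (readers, writers): sets of names found in one source string."""
    return set(OPEN_RE.findall(text)), set(SAVE_RE.findall(text))


def classify(readers, writers):
    """Split the names into read+write, read-only, and write-only groups."""
    return {
        "read_write": sorted(readers & writers),
        "read_only": sorted(readers - writers),
        "write_only": sorted(writers - readers),
    }


def scan_tree(root):
    """Scan every .py file under root (hypothetical helper for a source checkout)."""
    readers, writers = set(), set()
    for path in Path(root).rglob("*.py"):
        r, w = scan_source(path.read_text(encoding="utf-8", errors="replace"))
        readers |= r
        writers |= w
    return classify(readers, writers)


# Illustrative input only; these signatures are assumptions, not copied from the source.
sample = '''
def open_csv(vd, p): ...
def save_csv(vd, p, *sheets): ...
def open_mbox(vd, p): ...
def save_dot(vd, p, *sheets): ...
'''
result = classify(*scan_source(sample))
print(result)
```

Running scan_tree() on a checkout would give a starting point, but names registered some other way (such as the tabulate-based table formats) would need separate handling.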
Where would be a good place to put this information? It could go in a table like https://visidata.org/docs/formats/, perhaps in one of the guides? Accessible by a command like open-format-guide?
It's not clear if multiple extensions work (i.e., format+compression).
This does not work currently, but it's on my wishlist too. I'd be interested in a PR that addressed this.
the resulting file BestofRedditorUpdates_submissions.vds.zst is smaller, but cannot be reopened: "Unsupported operation: Underlying stream is not seekable."
Can the file be decompressed manually and then opened as .vds? If so, then it's likely a bug in the vds loader (otherwise it's a bug in the vds saver). This is a bug either way though.
Also I would support vdz as an alias for vds+zstd when that's possible.
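To make the format+compression idea concrete, here is a minimal sketch of how a filename could be split into a filetype and an optional compression suffix, with vdz treated as shorthand for vds+zstd. This is not how VisiData currently resolves extensions; the suffix set, the alias table, and the function name split_format are all assumptions for illustration.

```python
from pathlib import Path

# Assumed set of compression suffixes; the real supported set may differ.
COMPRESSION_SUFFIXES = {"gz", "bz2", "xz", "zst", "zstd"}

# Hypothetical alias table: a single extension standing for format+compression.
ALIASES = {"vdz": ("vds", "zst")}


def split_format(filename):
    """Return (filetype, compression) inferred from the filename's extensions.

    Either element is None when it cannot be inferred.
    """
    exts = [s.lstrip(".").lower() for s in Path(filename).suffixes]
    compression = None
    # Peel off a trailing compression suffix, if there is one.
    if exts and exts[-1] in COMPRESSION_SUFFIXES:
        compression = exts.pop()
    # Expand an alias only when no explicit compression suffix was given.
    if exts and compression is None and exts[-1] in ALIASES:
        return ALIASES[exts[-1]]
    filetype = exts[-1] if exts else None
    return filetype, compression


print(split_format("BestofRedditorUpdates_submissions.vds.zst"))
print(split_format("data.vdz"))
print(split_format("data.csv"))
```

A saver and a loader that both went through one function like this would at least agree on what a name such as .vds.zst means, which is the part that seems to go wrong today.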
I'm not sure what the best way to do this is, but some thoughts:
@saulpw I loaded the vds file and saved it as BestofRedditorUpdates_submissions.vds.zstd, which is a smaller file, but I am unable to decompress it manually.
❯ zstd --decompress BestofRedditorUpdates_submissions.vds.zstd
zstd: BestofRedditorUpdates_submissions.vds already exists; overwrite (y/n) ? y
zstd: BestofRedditorUpdates_submissions.vds.zstd: unsupported format
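One way to narrow down an "unsupported format" error like this is to check whether the file starts with the Zstandard frame magic number (bytes 28 B5 2F FD). The sketch below uses only the standard library; the function name looks_like_zstd and the demo file names are made up, and the check only recognizes a standard frame header, so it says nothing about whether the rest of the file is valid.

```python
import os
import tempfile

# Zstandard frame magic number 0xFD2FB528, stored little-endian on disk.
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def looks_like_zstd(path):
    """Return True if the file begins with the Zstandard frame magic bytes."""
    with open(path, "rb") as f:
        return f.read(len(ZSTD_MAGIC)) == ZSTD_MAGIC


def check_demo():
    """Build two throwaway files and report what the check says about each."""
    with tempfile.TemporaryDirectory() as tmp:
        with_magic = os.path.join(tmp, "with_magic.vds.zstd")
        plain = os.path.join(tmp, "plain.vds.zstd")
        with open(with_magic, "wb") as f:
            # Magic bytes plus padding: enough for the check, not a valid frame.
            f.write(ZSTD_MAGIC + b"\x00" * 8)
        with open(plain, "wb") as f:
            # Plain text saved under a .zstd name.
            f.write(b"plain text, not compressed\n")
        return looks_like_zstd(with_magic), looks_like_zstd(plain)


print(check_demo())
```

If the check returns False for the saved file, the saver probably wrote something other than zstd under that extension, which would point at the saving side rather than the loader.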
Two stories about how I could use more guidance or guard rails when saving work. Presently, I have to look up and refer to the supported formats, and then my choices often don't work.
Usenet
VisiData helped me find Elizabeth Edwards' (famous) participation on Usenet's alt.support.grief; vd can read the Internet Archive's mbox format and make quick work of searching.
Saving the derivative sheet is tricky though. vd defaults to tsv (even if I give the mbox extension), but there is no mbox save support, so I no longer know what format the resulting file is in, and I don't think vd does either when I return to the file. (I can save to csv, which is okay, but the result has some odd character conversions.)
Reddit
I'm analyzing posts on a subreddit which are in a "zstandard compressed ndjson" file. vd opens it well, but after some manipulations, I want to save it so I can return to the data as is, so vds seems like a natural format. And it works! However, I thought, why not save it compressed? The resulting file BestofRedditorUpdates_submissions.vds.zst is smaller, but cannot be reopened: "Unsupported operation: Underlying stream is not seekable."
Consequently, relying on the file extension is problematic outside of the simplest cases because: