Closed nfultz closed 7 years ago
Is there a StringIO-like class that wraps a file handle and caches what has been read, so that if the contents are read twice, it only goes to the file handle the first time, and without reading the whole file in first? The way the .txt/.tsv parser works, it reads the first N characters to see if there are any tabs, and if so, sends it to the .tsv parser, which re-reads it.
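A minimal sketch of the kind of wrapper being asked about, in case it helps the discussion. The class name `CachingReader` and its `rewind` method are hypothetical, not part of vd or the standard library. It only pulls from the underlying handle when the cache does not already cover the request.

```python
import io


class CachingReader:
    """Hypothetical wrapper: remembers what was read so it can be replayed."""

    def __init__(self, fp):
        self._fp = fp    # underlying text handle, possibly non-seekable
        self._buf = ""   # everything pulled from fp so far
        self._pos = 0    # current position within the logical stream

    def read(self, n=-1):
        if n is None or n < 0:
            # Read-to-end: fetch whatever the handle still has.
            self._buf += self._fp.read()
            end = len(self._buf)
        else:
            end = self._pos + n
            missing = end - len(self._buf)
            if missing > 0:
                # Only touch the handle for the part not cached yet.
                self._buf += self._fp.read(missing)
            end = min(end, len(self._buf))
        data = self._buf[self._pos:end]
        self._pos = end
        return data

    def rewind(self):
        """Go back to the start; later reads are served from the cache first."""
        self._pos = 0


# Sniff the first few characters, then hand the same object to the parser.
reader = CachingReader(io.StringIO("a\tb\n1\t2\n"))
looks_like_tsv = "\t" in reader.read(4)
reader.rewind()
```

This version keeps everything it has read in memory and appends with plain string concatenation, so for GB-sized inputs it would need to drop the cache once sniffing is done. It also lacks `readline` and iteration, which a real parser would probably want.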
It seems like it would be a lot easier to just write the psql and mysql plugins, like the existing sqlite plugin. Then writes and 'offline' browsing would be possible as well.
I thought of this right away too, but I'd do it slightly differently.
You should use a filename of - for stdin (a very common CLI convention):
cat birdsdiet.tsv | vd --from=tsv -
With --from=- (stdin), you always read the whole file and parse the string instead of the file handle. If the parser only takes a file handle, use io.StringIO:
https://stackoverflow.com/questions/11914472/stringio-in-python3
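Roughly what I mean, as a sketch only. `open_source` and `load_tsv` are made-up names for illustration, not vd functions, and the parser here is just the standard `csv` module.

```python
import csv
import io
import sys


def open_source(path, stdin=None):
    """Return a seekable text handle; a path of '-' means slurp stdin."""
    if path == "-":
        stdin = sys.stdin if stdin is None else stdin
        # Read everything up front so the result can be rewound and re-parsed.
        return io.StringIO(stdin.read())
    return open(path, newline="")


def load_tsv(fp):
    """Stand-in parser that only accepts a file handle."""
    return list(csv.reader(fp, delimiter="\t"))


# Demonstration with a fake stdin; in real use, omit the stdin argument.
fake_stdin = io.StringIO("name\tdiet\nowl\tmice\n")
fp = open_source("-", stdin=fake_stdin)
has_tabs = "\t" in fp.read(16)   # sniff the start of the data
fp.seek(0)                       # StringIO can be rewound, unlike a pipe
rows = load_tsv(fp)
```

The obvious cost is that the whole input sits in memory before any parsing starts.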
This would be good for using curl on open data sources with vd:
curl -s https://docs.google.com/spreadsheets/d/1gO-zUzEnPOnYMYnC9OwrBWvl2TLxpz5Y3YNIvJYRB7c/export?format=tsv | vd -
I think that by default, stdin should be a list of commands. This is also a common CLI idiom.
It would be pretty cool if you could save the current list of commands in a session (the list from 'D' log) to a file like birdsdiet.log maybe, and then replay the session with:
vd < birdsdiet.log
I might try to make a PR for this...
You can save the current list of commands from the comman'D'log to a .vd file, and then replay it with bin/vdplay (not installed by default yet, I don't think). Or load it into vd and replay with ga. Note that vdplay does variable substitutions from command-line args too.
There is already a -f/--filetype option to vd, which I think is what your --from is intended to mean.
I often use vd for GB datasets, and I worked hard to make it responsive from the start, as it continues loading in the background. So I don't want to read in the entire file before parsing. I will look into using stdin like another file descriptor. The disadvantage, as stated above, is that stdin can't be rewound. But I'd rather make a StringIO replacement than suffer the performance consequences of doing it a substandard way.
That all makes sense. Sorry for not learning all the options up front.
I might make vdplay be an option (vd --replay) instead of adding new bins.
vd: pipe and redirect of stdin autodetected via isatty.
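For anyone following along, isatty-based detection generally looks something like the sketch below. This is not the actual vd code, and the helper name `stdin_has_data` is made up for illustration.

```python
import io
import sys


def stdin_has_data(stdin=None):
    """True when stdin is a pipe or redirect rather than an interactive terminal."""
    stdin = sys.stdin if stdin is None else stdin
    return not stdin.isatty()


class FakeTerminal(io.StringIO):
    """Test double that pretends to be an interactive terminal."""

    def isatty(self):
        return True


piped = stdin_has_data(io.StringIO("a\tb\n"))   # behaves like `cmd | prog`
interactive = stdin_has_data(FakeTerminal())    # behaves like plain `prog`
```

An interactive program that consumes piped stdin usually also needs another way to read keystrokes, such as opening /dev/tty on Unix. That part is not shown here.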
In order to use visidata as a pager with psql or mysql, you need to be able to pipe in data in addition to specifying a file. I had a similar app; here is the snippet that did this: https://github.com/nfultz/ffss/blob/master/ffss/__init__.py#L60