datalad / datalad-deprecated

DataLad extension for functionality that has been phased out of the core package

Hard-to-read structured output from "datalad ls" #18

Open TobiasKadelka opened 5 years ago

TobiasKadelka commented 5 years ago

@mih When using

datalad ls -L s3://hcp-openaccess/HCP_1200/100206/unprocessed/3T/100206_3T.csv

the output is hard to read and parse, which makes it hard to use this information.

mih commented 5 years ago

@TobiasKadelka I have transferred this issue to datalad/datalad which provides the ls command.

It would indeed be good if ls could provide structured output that is straightforward to wrangle into suitable input for addurls -t json. If addurls would additionally also support eating a JSON stream instead of an array of objects, they might even get piped together.

ATM datalad ls --long provides a custom format solution that requires a custom parser.
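
To illustrate the gap between a JSON stream and an array of objects, here is a minimal sketch. It assumes a hypothetical producer that emits one JSON object per line (ls does not do this today) and collects the stream into the array that addurls -t json expects. The function name and the record fields ("url", "name") are illustrative only and not part of DataLad.

```python
import json


def stream_to_array(lines):
    """Collect a JSON stream (one object per line) into a list of objects."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        records.append(json.loads(line))
    return records


# Hypothetical stream: field names are assumptions, not real ls output.
stream = [
    '{"url": "s3://bucket/a.csv", "name": "a.csv"}\n',
    '{"url": "s3://bucket/b.csv", "name": "b.csv"}\n',
]
array_text = json.dumps(stream_to_array(stream), indent=2)
```

The resulting array_text could be written to a file and inspected before it is handed to addurls.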

kyleam commented 5 years ago

@mih:

If addurls would additionally also support eating a JSON stream instead of an array of objects, they might even get piped together.

I've had a local to-do on the back burner regarding this. I'll promote it to a DataLad issue since it came up here. (In the case of listing an S3 bucket, though, I'd be a bit leery of feeding the output to addurls without inspection.)

yarikoptic commented 5 years ago

IMHO it also relates to https://github.com/datalad/datalad/pull/2126 since that one should feed records into pyout so it would become just a matter of switching "output formatter"

yarikoptic commented 5 years ago

Regarding the original issue: let me know in which language you would prefer to see a one- or two-liner to parse it. Having said that, it may indeed be easiest just to write targeted code using boto in Python, so it goes faster without trying to validate URLs etc.
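
A rough sketch of what such targeted code might look like, under several assumptions: it uses boto3 (the current AWS SDK) rather than the older boto named above, it needs valid AWS credentials for the bucket, and the record fields ("url", "name", "size") and both function names are made up for illustration. Page parsing is kept separate from the network call so it can be tried without S3 access.

```python
def records_from_pages(bucket, pages):
    """Turn S3 ListObjectsV2 response pages into flat records."""
    for page in pages:
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            yield {
                "url": f"s3://{bucket}/{obj['Key']}",
                "name": obj["Key"].rsplit("/", 1)[-1],
                "size": obj["Size"],
            }


def list_bucket(bucket, prefix=""):
    """List keys under a prefix; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so records_from_pages works without it

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
    return records_from_pages(bucket, pages)
```

For the bucket from the original report, this could be called as list(list_bucket("hcp-openaccess", "HCP_1200/100206/unprocessed/3T/")), and the records dumped with json.dumps for inspection.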

kyleam commented 5 years ago

@yarikoptic:

IMHO it also relates to https://github.com/datalad/datalad/pull/2126 since that one should feed records into pyout so it would become just a matter of switching "output formatter"

I don't think pyout is a viable solution to this issue. It (1) has a blocker that I don't have any good idea on how to unblock and (2) doesn't currently have the ability to output JSON records (and that's not trivial to do with the current implementation).

kyleam commented 5 years ago

@yarikoptic:

Regarding the original issue: let me know in which language you would prefer to see a one- or two-liner to parse it. Having said that, it may indeed be easiest just to write targeted code using boto in Python, so it goes faster without trying to validate URLs etc.

FWIW the second option sounds like the better one to me.

yarikoptic commented 5 years ago

Re pyout - sorry I wasn't clear. I didn't want pyout to output JSON, but the changes in that PR could be relevant for restructuring the code so that we could switch the output renderer: one could be pyout consuming the records, and another could just print them out as JSON.
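
The idea of a switchable renderer could be sketched roughly as follows. This is not DataLad or pyout code: all names are hypothetical, and the tab-separated renderer merely stands in for a pyout-style tabular consumer of the same records.

```python
import json


def render_json(records):
    """Render each record as one JSON object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def render_table(records):
    """Render records as tab-separated columns (stand-in for a tabular renderer)."""
    records = list(records)
    if not records:
        return ""
    columns = sorted(records[0])  # column order taken from the first record
    rows = ["\t".join(columns)]
    rows.extend("\t".join(str(r.get(c, "")) for c in columns) for r in records)
    return "\n".join(rows)


# The producer only yields records; the output format is chosen here.
RENDERERS = {"json": render_json, "table": render_table}


def render(records, fmt="table"):
    return RENDERERS[fmt](records)
```

With this split, the listing code would not need to know how its records end up being displayed.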

kyleam commented 5 years ago

@yarikoptic:

Re pyout - sorry I wasn't clear. I didn't want pyout to output JSON, but the changes in that PR could be relevant for restructuring the code [...]

Oops, that was pretty clear reading it again. Fair enough. Sorry for the noise.