dandi / dandi-cli

DANDI command line client to facilitate common operations
https://dandi.readthedocs.io/
Apache License 2.0
22 stars 25 forks source link

upload doesn't work when pointed to a file inside a dandiset #1252

Closed satra closed 1 year ago

satra commented 1 year ago

inside a bids dandiset, we did dandi upload README, it found 0 files to consider.

TheChymera commented 1 year ago

I think I located this one. It seems to work with --allow-any-path and this is by design because we don't just want to accept GenericAssets. @jwodder Should I create a built-in whitelist for known GenericAssets, or is the solution to just make sure they're detected as something else?

(dev) [deco]~/src/bids-examples/asl003 ❱ DANDI_DEVEL=1 dandi upload -i dandi-staging --allow-any-path README
2023-04-03 12:53:01,245 [    INFO] Note: NumExpr detected 12 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2023-04-03 12:53:01,245 [    INFO] NumExpr defaulting to 8 threads.
/home/chymera/src/bids-examples/asl003/README
GenericAsset(filepath=PosixPath('/home/chymera/src/bids-examples/asl003/README'), dandiset_path=PosixPath('/home/chymera/src/bids-examples/asl003'), path='README')
2023-04-03 12:53:03,101 [    INFO] Found 1 files to consider
PATH                 SIZE      ERRORS UPLOAD STATUS                MESSAGE
README               232 Bytes   0           ERROR                 Error 404 while sending POST request to https://api-staging.dandiarchi...
Summary:             232 Bytes               1 ERROR               1 Error 404 while sending POST request to https://api-staging.dandiarc...
satra commented 1 year ago

@TheChymera - given bids support, we should no longer need to use either DANDI_DEVEL, --allow-any-path or --validation ignore. thus in this case i expect the system to automatically detect that the README file is within a dandiset, and that it's a BIDS dandiset.

yarikoptic commented 1 year ago

Given that we have odd behavior of silently ignoring non-nwb/video files in DANDI I think we should change/generalize behavior of validate and upload to be

jwodder commented 1 year ago

@yarikoptic @TheChymera To be clear, if the user passes to dandi upload only some of the paths in a BIDS dataset, we still need to take all files in the BIDS dataset into account for validation purposes, correct?

yarikoptic commented 1 year ago

Your question/comment @jwodder above is in agreement with what I said during our standup today because in principle (since probably not yet implemented in the bids validator code) validating some files in BIDS dandiset might require knowledge/content of other files in BIDS dataset (e.g. while uploading _scans.tsv files to ensure that they point to the correct/existing files etc). That is partially why the "canonical" bids-validator operates AFAIK only on the entire/complete BIDS datasets which has pros and cons to it.

Some naive utopian, but somewhat an ideal solution could have probably been: run validator on all files but then filter results/errors only for the paths which were provided by the user. It would be "naive utopian" in that it would indeed account for any possible intricate dependencies within BIDS dataset hierarchy, while "affecting" upload of only the paths user points to. If user does not restrict to any paths -- all validation errors are "in affect".

Given the "hierarchical" organization of BIDS and its inheritance principle that metadata could be inherited from directories up the hierarchy, I think we could "optimize" aforementioned naive to: take only the files from the top of the dataset to the paths pointed to by the user, validate this "subset" of the dataset, and filter for the records specifically for the paths limited to by the user. An example: dandi upload sub-12*/sub-2 would

May be @TheChymera or @effigies sees immediately where such "optimization" of validation could potentially fail?

github-actions[bot] commented 1 year ago

:rocket: Issue was released in 0.55.0 :rocket: