Closed satra closed 1 year ago
I think I located this one. It seems to work with `--allow-any-path`, and this is by design, because we don't just want to accept `GenericAsset`s. @jwodder Should I create a built-in whitelist for known `GenericAsset`s, or is the solution to just make sure they're detected as something else?
```
(dev) [deco]~/src/bids-examples/asl003 ❱ DANDI_DEVEL=1 dandi upload -i dandi-staging --allow-any-path README
2023-04-03 12:53:01,245 [ INFO] Note: NumExpr detected 12 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2023-04-03 12:53:01,245 [ INFO] NumExpr defaulting to 8 threads.
/home/chymera/src/bids-examples/asl003/README
GenericAsset(filepath=PosixPath('/home/chymera/src/bids-examples/asl003/README'), dandiset_path=PosixPath('/home/chymera/src/bids-examples/asl003'), path='README')
2023-04-03 12:53:03,101 [ INFO] Found 1 files to consider
PATH     SIZE       ERRORS  UPLOAD  STATUS  MESSAGE
README   232 Bytes  0               ERROR   Error 404 while sending POST request to https://api-staging.dandiarchi...
Summary: 232 Bytes  1               ERROR   1 Error 404 while sending POST request to https://api-staging.dandiarc...
```
@TheChymera - given BIDS support, we should no longer need to use `DANDI_DEVEL`, `--allow-any-path`, or `--validation ignore`. Thus, in this case, I expect the system to automatically detect that the `README` file is within a dandiset, and that it's a BIDS dandiset.
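The auto-detection described above could be approached roughly as follows: walk up from the given file looking for a BIDS `dataset_description.json` marker. This is a hedged, illustrative sketch; the function name is hypothetical and this is not the actual dandi-cli implementation.

```python
from pathlib import Path
from typing import Optional


def find_bids_dandiset_root(path: Path) -> Optional[Path]:
    """Return the nearest ancestor directory that looks like a BIDS dataset
    root (contains dataset_description.json), or None if there is none.

    Hypothetical helper, not part of dandi-cli.
    """
    for parent in [path, *path.parents]:
        if (parent / "dataset_description.json").is_file():
            return parent
    return None
```

With such a check in place, `dandi upload README` inside a BIDS dandiset could decide to treat `README` as a BIDS dataset file rather than a `GenericAsset`.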
Given that we have the odd behavior of silently ignoring non-NWB/video files in DANDI, I think we should change/generalize the behavior of `validate` and `upload` to skip only known service directories (`.dandi/`, `.git/`, `.datalad/` ATM) and discover all paths.

@yarikoptic @TheChymera To be clear: if the user passes only some of the paths in a BIDS dataset to `dandi upload`, we still need to take all files in the BIDS dataset into account for validation purposes, correct?
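The generalized discovery behavior proposed above could look something like this sketch: walk the dandiset and yield every file, pruning only the named service directories. Purely illustrative; `discover_paths` is a hypothetical name, not dandi-cli code.

```python
import os
from pathlib import Path

# Service directories to skip, as listed in the discussion above.
SERVICE_DIRS = {".dandi", ".git", ".datalad"}


def discover_paths(root: Path):
    """Yield every file under *root*, skipping known service directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into service directories.
        dirnames[:] = [d for d in dirnames if d not in SERVICE_DIRS]
        for name in filenames:
            yield Path(dirpath) / name
```

Everything else — including `README` and other non-NWB files — would then be discovered and handed to the (BIDS) validator, rather than silently ignored.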
Your question/comment above, @jwodder, is in agreement with what I said during our standup today, because in principle (though probably not yet implemented in the bids-validator code) validating some files in a BIDS dandiset might require knowledge of the content of other files in the BIDS dataset (e.g., while uploading `_scans.tsv` files, to ensure that they point to correct/existing files, etc.). That is partially why the "canonical" bids-validator operates, AFAIK, only on entire/complete BIDS datasets, which has its pros and cons.
A naive, utopian, but somewhat ideal solution could probably have been: run the validator on all files, but then filter the results/errors down to only the paths which were provided by the user. It would be "naively utopian" in that it would indeed account for any possible intricate dependencies within the BIDS dataset hierarchy, while "affecting" the upload of only the paths the user points to. If the user does not restrict to any paths, all validation errors are in effect.
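The "validate everything, filter the errors" scheme reduces to a small path filter over whatever error records the validator produces. A minimal sketch, assuming a hypothetical `ValidationError` record with a dataset-relative `path` (not the actual dandi-cli or bids-validator data model):

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class ValidationError:
    path: str      # dataset-relative path the error applies to
    message: str


def errors_for_user_paths(all_errors, user_paths):
    """Keep only errors located under a user-supplied path.

    An empty *user_paths* means no restriction: all errors apply.
    """
    if not user_paths:
        return list(all_errors)
    roots = [PurePosixPath(p) for p in user_paths]

    def under(err):
        p = PurePosixPath(err.path)
        return any(p == r or r in p.parents for r in roots)

    return [e for e in all_errors if under(e)]
```

The full dataset is still validated, so cross-file dependencies (like `_scans.tsv` references) are checked, but only errors touching the user's paths would block the upload.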
Given the "hierarchical" organization of BIDS and its inheritance principle, whereby metadata can be inherited from directories up the hierarchy, I think we could "optimize" the aforementioned naive approach to: take only the files from the top of the dataset down to the paths pointed to by the user, validate this "subset" of the dataset, and filter for the records specific to the paths the user limited to. An example: `dandi upload sub-12*/sub-2` would

- include all top-level files (`README`, `dataset_description.json`, `task-*.json`, etc.), so possibly required metadata fields do become available while validating individual files under the specific `sub-12*/ses-2` folders
- include all files under the `sub-12*/` folders (which might carry some extra metadata overloads per the inheritance principle)
- validate that subset, but upload only the files under the `sub-12*/sub-2` folders which the shell expanded to.

Maybe @TheChymera or @effigies sees immediately where such an "optimization" of validation could potentially fail?
:rocket: Issue was released in 0.55.0 :rocket:
Inside a BIDS dandiset, we did `dandi upload README`; it found 0 files to consider.