bids-standard / bids-validator

Validator for the Brain Imaging Data Structure
https://bids-standard.github.io/bids-validator/
MIT License

spit out more informative/meaningful msg whenever some file is not actually accessible #197

Closed yarikoptic closed 4 years ago

yarikoptic commented 8 years ago

Just trying it on a git-annex repo without the files actually fetched (just to validate the hierarchy and file names), but the error is not really 'informative' -- I wish it could at least say which file it wants to see.

hopa:/tmp/datasets.datalad.org/openfmri/ds000206/sourcedata
$> ~/bin/bids-validator-docker .
fs.js:839
  return binding.lstat(pathModule._makeLong(path));
                 ^

Error: ENOENT: no such file or directory, lstat './proc/1/fd/12'
    at Error (native)
    at Object.fs.lstatSync (fs.js:839:18)
    at getFiles (/usr/lib/node_modules/bids-validator/utils/files.js:100:16)
    at getFiles (/usr/lib/node_modules/bids-validator/utils/files.js:101:13)
    at getFiles (/usr/lib/node_modules/bids-validator/utils/files.js:101:13)
    at getFiles (/usr/lib/node_modules/bids-validator/utils/files.js:101:13)
    at Object.readDir (/usr/lib/node_modules/bids-validator/utils/files.js:78:21)
    at Object.BIDS.start [as BIDS] (/usr/lib/node_modules/bids-validator/validators/bids.js:32:21)
    at module.exports (/usr/lib/node_modules/bids-validator/cli.js:12:18)
    at Object.<anonymous> (/usr/lib/node_modules/bids-validator/bin/bids-validator:20:18)

FWIW, --verbose wasn't much help.

constellates commented 8 years ago

This is not an intended validator error; it's a language feature inside the validator throwing an unhandled error because it wasn't expecting the git-annex formatting.

Is this the correct url (datasets.datalad.org/openfmri/ds000206/sourcedata) to fetch the git-annex data to re-create this?

yarikoptic commented 8 years ago

On Tue, 20 Sep 2016, constellates wrote:

> This is not an intended validator error; it's a language feature inside the validator throwing an unhandled error because it wasn't expecting the git-annex formatting.
>
> Is this the correct url (datasets.datalad.org/openfmri/ds000206/sourcedata) to fetch the git-annex data to re-create this?

If you just want to fetch the "structure", do

git clone http://datasets.datalad.org/openfmri/ds000206/.git

but if you need the file content as well, which for now is available only from tarballs, you would need datalad; then try

datalad install ///openfmri/ds000206
cd ds000206
git annex get sourcedata

(after https://github.com/datalad/datalad/pull/613 is merged, we will release, and there will be no need for a manual git annex get)

constellates commented 8 years ago

Hmm... After cloning that dataset and running the validator, I get the following:

$ ./bin/bids-validator ../ds000206/
    1: This file appears to be an orphaned symlinked. Make sure it correctly points to its referent. (code: 43)
        /sub-THP0001/ses-THP0001CCF1/anat/sub-THP0001_ses-THP0001CCF1_run-01_T1w.json
        /sub-THP0001/ses-THP0001CCF1/anat/sub-THP0001_ses-THP0001CCF1_run-01_T1w.nii.gz
        /sub-THP0001/ses-THP0001CCF1/anat/sub-THP0001_ses-THP0001CCF1_run-01_T2w.json
        /sub-THP0001/ses-THP0001CCF1/anat/sub-THP0001_ses-THP0001CCF1_run-01_T2w.nii.gz
        /sub-THP0001/ses-THP0001CCF1/dwi/sub-THP0001_ses-THP0001CCF1_acq-GD31_run-01_dwi.bval
        /sub-THP0001/ses-THP0001CCF1/dwi/sub-THP0001_ses-THP0001CCF1_acq-GD31_run-01_dwi.bvec
        /sub-THP0001/ses-THP0001CCF1/dwi/sub-THP0001_ses-THP0001CCF1_acq-GD31_run-01_dwi.json
        /sub-THP0001/ses-THP0001CCF1/dwi/sub-THP0001_ses-THP0001CCF1_acq-GD31_run-01_dwi.nii.gz
        /sub-THP0001/ses-THP0001CCF1/dwi/sub-THP0001_ses-THP0001CCF1_acq-GD31_run-02_dwi.bval
        /sub-THP0001/ses-THP0001CCF1/dwi/sub-THP0001_ses-THP0001CCF1_acq-GD31_run-02_dwi.bvec
        ... and 1502 more files having this issue (Use --verbose to see them all).

    1: This file is not part of the BIDS specification, make sure it isn't included in the dataset by accident. Data derivatives (processed data) should be placed in /derivatives folder. (code: 1)
        /.datalad/config
            Evidence: config
        /.datalad/crawl/crawl.cfg
            Evidence: crawl.cfg
        /.datalad/crawl/statuses/incoming.json
            Evidence: incoming.json
        /.datalad/crawl/versions/incoming.json
            Evidence: incoming.json
        /.datalad/meta/meta.json
            Evidence: meta.json
        /.gitattributes
            Evidence: .gitattributes

    2: Not all subjects contain the same files. Each subject should contain the same number of files with the same naming unless some files are known to be missing. (code: 38)
        /sub-THP0001/ses-THP0002CCF1/anat/sub-THP0001_ses-THP0002CCF1_run-01_T1w.json
        /sub-THP0001/ses-THP0002CCF1/anat/sub-THP0001_ses-THP0002CCF1_run-01_T1w.nii.gz
        /sub-THP0001/ses-THP0002CCF1/anat/sub-THP0001_ses-THP0002CCF1_run-01_T2w.json
        /sub-THP0001/ses-THP0002CCF1/anat/sub-THP0001_ses-THP0002CCF1_run-01_T2w.nii.gz
        /sub-THP0001/ses-THP0002CCF1/dwi/sub-THP0001_ses-THP0002CCF1_acq-GD31_run-01_dwi.bval
        /sub-THP0001/ses-THP0002CCF1/dwi/sub-THP0001_ses-THP0002CCF1_acq-GD31_run-01_dwi.bvec
        /sub-THP0001/ses-THP0002CCF1/dwi/sub-THP0001_ses-THP0002CCF1_acq-GD31_run-01_dwi.json
        /sub-THP0001/ses-THP0002CCF1/dwi/sub-THP0001_ses-THP0002CCF1_acq-GD31_run-01_dwi.nii.gz
        /sub-THP0001/ses-THP0002CCF1/dwi/sub-THP0001_ses-THP0002CCF1_acq-GD31_run-02_dwi.bval
        /sub-THP0001/ses-THP0002CCF1/dwi/sub-THP0001_ses-THP0002CCF1_acq-GD31_run-02_dwi.bvec
        ... and 7550 more files having this issue (Use --verbose to see them all).

        Summary:                    Available Tasks:        Available Modalities:
        1934 Files, 849.22kB                                T1w
        6 - Subjects                                        T2w
        55 - Sessions                                       dwi

And if I run it on the sourcedata as in your example, I get this:

$ ./bin/bids-validator ../ds000206/sourcedata/
This does not appear to be a BIDS dataset. For more info go to http://bids.neuroimaging.io/

I'm also not seeing files like this './proc/1/fd/12' (from your original stacktrace) in the sourcedata. Is there a process filesystem in your sourcedata?

yarikoptic commented 8 years ago

On Tue, 20 Sep 2016, constellates wrote:

> Hmm.. After cloning that dataset and running the validator I get the following

THANKS for trying! And I am sorry -- I should have looked in detail myself... I was using a dockerized version run by the script:

$> cat ~/bin/bids-validator-docker
#!/bin/sh

set -eu
docker run -it --rm -v "${PWD}:${PWD}:ro" bids/base_validator bids-validator "$@"

but the problem was that I provided a relative path to the dataset... providing the full path works...

hopa:/tmp/datasets.datalad.org/openfmri/ds000206
$> ~/bin/bids-validator-docker $PWD/
    1: Not a valid JSON file. (code: 27)
        /sub-THP0001/ses-THP0001DART1/anat/sub-THP0001_ses-THP0001DART1_run-01_T2w.json
            @ line: 9 character: 5
            Evidence: "PulseSequenceType": "SPACE"
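
The relative-path failure is consistent with how relative paths are resolved: against the working directory of the process that receives them. The wrapper script mounts the host directory but does not set a working directory for the container, so `.` is presumably resolved inside the container, and the `./proc/1/fd/12` in the stack trace suggests that directory was `/`. A minimal sketch of the difference, using Node's `path` module; `resolveDatasetDir` is a hypothetical helper, and the container working directory of `/` is an assumption:

```javascript
const path = require('path');

// Hypothetical helper: resolve a dataset argument against a given working
// directory. path.posix is used so the result is the same on any host OS.
function resolveDatasetDir(arg, cwd) {
  return path.posix.resolve(cwd, arg);
}

const hostCwd = '/tmp/datasets.datalad.org/openfmri/ds000206/sourcedata';

// On the host, "." means the dataset directory.
console.log(resolveDatasetDir('.', hostCwd));

// Assuming the container's working directory is "/", "." means the
// container's root, which includes /proc.
console.log(resolveDatasetDir('.', '/'));

// An absolute path resolves to itself regardless of the working directory.
console.log(resolveDatasetDir(hostCwd, '/'));
```

This is why passing `$PWD/` works with the script as written: the absolute path is the same on the host and in the container because the volume is mounted at the identical location.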

feel free to close unless you do want to catch that exception and spit out some less scary message ;)

constellates commented 8 years ago

No worries. I'm glad you figured it out and I got a small introduction to datalad in the process! I'll keep this open for the time being so we can potentially add some more error handling/reporting around our file stat checking.
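
As a rough illustration of what error handling around the file stat checking could look like, here is a minimal sketch of a recursive directory walk that records the path of any entry it cannot read instead of letting the exception escape. `collectFiles` and the shape of its return value are hypothetical; this is not the validator's actual `getFiles` in utils/files.js.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not the validator's getFiles): walks `dir`
// recursively, collecting file paths and recording unreadable entries
// together with the path that failed.
function collectFiles(dir, files = [], problems = []) {
  let entries;
  try {
    entries = fs.readdirSync(dir);
  } catch (err) {
    problems.push({ path: dir, code: err.code, message: err.message });
    return { files, problems };
  }
  for (const name of entries) {
    const fullPath = path.join(dir, name);
    let stats;
    try {
      // lstatSync can throw (e.g. ENOENT) if the entry vanishes between
      // readdirSync and this call, as happens under /proc.
      stats = fs.lstatSync(fullPath);
    } catch (err) {
      problems.push({ path: fullPath, code: err.code, message: err.message });
      continue;
    }
    if (stats.isDirectory()) {
      collectFiles(fullPath, files, problems);
    } else {
      files.push(fullPath);
    }
  }
  return { files, problems };
}
```

A caller could then print each entry in `problems` as a regular message naming the inaccessible file, rather than surfacing a raw stack trace.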

yarikoptic commented 4 years ago

The world was a much worse place back then. I think it has improved, and new dedicated issues would make it even better.