Open mateuszpawlik opened 2 months ago
I understand that the legacy validator may be of lower priority, but is there anything I can do to fix this?
I personally don't have any notion how to start investigating this. Are you also seeing the issue with the schema validator?
If you mean the Deno-based validator, I'm planning to run it. I'm fetching the data now. Once I've done it, I will report back, but possibly next week. Thanks.
I've tried the Deno validator. It also reports weird issues for most of the files, which should not be reported. I checked some of the reported files manually and there should be no issue.
I used this command:
docker run -ti --rm -v $PWD:/data:ro denoland/deno deno run --allow-read --allow-env https://deno.land/x/bids_validator@v1.14.8/bids-validator.ts /data
I also get "[WARNING] The onset column in events.tsv files should be sorted. (EVENT_ONSET_ORDER)". My guess is, that this is because we have negative onsets at the beginning.
The number of files seems to be reported correctly in the summary of both validators: 4428 Files.
Update: These issues show up even if I bidsignore all subjects but one.
We had some invalid JSON files in the dataset. We knew about them. Fixing them made the legacy validator behave more as expected: it no longer reports the errors I listed originally. It did not help with the issues reported by the Deno validator.
It seems that something is happening internally, which causes throwing other errors when some json files are invalid. I remember reading about similar issues. Are there maybe any errors set by default and reported when something else breaks?
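Since invalid JSON files seemed to trigger unrelated errors, one way to rule them out before running either validator is to try parsing every JSON file in the dataset. A minimal sketch, assuming the dataset is available on the local filesystem; `find_invalid_json` is a hypothetical helper name, not part of the BIDS validator:

```python
import json
from pathlib import Path


def find_invalid_json(root):
    """Return (path, error message) pairs for JSON files that fail to parse."""
    bad = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            with path.open(encoding="utf-8") as fh:
                json.load(fh)
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            # Record the parser's message, which includes line and column.
            bad.append((path, str(err)))
    return bad
```

Calling `find_invalid_json("/data")` and printing the result gives a list of files to fix first. This only checks syntax, not whether the contents follow the BIDS schema.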
I've tried the Deno validator. It also reports weird issues for most of the files, which should not be reported. I checked some of the reported files manually and there should be no issue.
I used this command:
docker run -ti --rm -v $PWD:/data:ro denoland/deno deno run --allow-read --allow-env https://deno.land/x/bids_validator@v1.14.8/bids-validator.ts /data
We've been making a bunch of fixes in the last week or so since that release, so can you re-test with:
deno run --reload -A https://github.com/bids-standard/bids-validator/raw/master/bids-validator/src/bids-validator.ts
?
* [WARNING] NIfTI file's header field for dimension information is blank or too short. (NIFTI_DIMENSION)
This means that either dim[0] == 0 or min(dim[1:dim[0]]) <= 0. If that's not the case, that's a bug. If you could share the NIfTI header, I could test this. Simplest way to share:
python -c "import sys, nibabel; print(nibabel.load(sys.argv[1]).to_bytes()[:348])" <PATH>
* [WARNING] NIfTI file's header field for pixel dimension information is empty or too short. (NIFTI_PIXDIM)
min(pixdim[1:dim[0]]) <= 0
* [WARNING] A data file's JSON sidecar is missing a key listed as recommended. (SIDECAR_KEY_RECOMMENDED)
This is almost impossible not to have, and a result of the schema validator systematically reporting RECOMMENDED sidecar fields. It's going to be noisy because BIDS has a lot of these, but they've only been selectively applied in the legacy validator. (Preview: I'm going to be advocating for reducing many fields to OPTIONAL.)
* [ERROR] Repetition time did not match between the scan's header and the associated JSON metadata file. (REPETITION_TIME_MISMATCH)
This could be a rounding problem: https://github.com/bids-standard/bids-validator/issues/2091
* [WARNING] You must define 'EchoTime' for this file. 'EchoTime' is the echo time (TE) for the acquisition, specified in seconds. (...)
This is a problem in the schema, I'm going to submit a patch today.
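The header conditions in the list above can be checked locally before sharing a file. The following is a rough sketch, not the validator's implementation. It assumes a little-endian NIfTI-1 header (dim as eight int16 values at byte offset 40, pixdim as eight float32 values at byte offset 76), reads the index range as dimensions 1 through dim[0] inclusive, and uses an arbitrary tolerance of 1 ms for the repetition time comparison. The function names are hypothetical.

```python
import math
import struct


def parse_dims(header, byteorder="<"):
    """Return (dim, pixdim) tuples from raw NIfTI-1 header bytes."""
    dim = struct.unpack_from(byteorder + "8h", header, 40)
    pixdim = struct.unpack_from(byteorder + "8f", header, 76)
    return dim, pixdim


def header_issues(dim, pixdim):
    """List the issue codes whose condition holds for this header."""
    ndim = dim[0]
    # Assumption: dim[0] outside 1..7 is treated like dim[0] == 0.
    if not 1 <= ndim <= 7:
        return ["NIFTI_DIMENSION"]
    issues = []
    if min(dim[1:ndim + 1]) <= 0:
        issues.append("NIFTI_DIMENSION")
    if min(pixdim[1:ndim + 1]) <= 0:
        issues.append("NIFTI_PIXDIM")
    return issues


def tr_matches(header_tr, sidecar_tr, tol=1e-3):
    """Compare repetition times in seconds with an absolute tolerance."""
    return math.isclose(header_tr, sidecar_tr, rel_tol=0.0, abs_tol=tol)
```

The 348 header bytes can be passed straight to `parse_dims`. The tolerance matters because pixdim is stored as float32: a repetition time of 0.72 read back from the header is not exactly equal to the 0.72 parsed from a JSON sidecar, so an exact comparison would report a mismatch.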
I also get "[WARNING] The onset column in events.tsv files should be sorted. (EVENT_ONSET_ORDER)". My guess is, that this is because we have negative onsets at the beginning.
This is surprising. I've just checked the sorting function and it should handle negative numbers fine. Would you open an issue with a failing file?
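To check a single events file independently of the validator, the onsets can be parsed as numbers and compared pairwise. This is a sketch of such a check, not the validator's sorting function; `onsets_sorted` is a hypothetical name, and it assumes tab-separated text with an `onset` column, skipping `n/a` entries.

```python
import csv
import io


def onsets_sorted(tsv_text):
    """Return True if the numeric onset values are in non-decreasing order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    onsets = [
        float(row["onset"])
        for row in reader
        if row["onset"].strip().lower() != "n/a"
    ]
    # Compare each onset with its successor; negative values need no special case.
    return all(a <= b for a, b in zip(onsets, onsets[1:]))
```

Parsing to float is the important step: compared as strings, "-2.0" sorts after "-0.5", which would make a correctly ordered file with negative onsets look unsorted.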
Now I'm getting the following error with the legacy validator, which doesn't make sense to me: why should the JSON files in the phenotype directory be validated against a schema and be required to have the listed properties?
[ERR] Invalid JSON file. The file is not formatted according the schema. (code: 55 - JSON_SCHEMA_VALIDATION_ERROR)
./phenotype/ASR.json
Invalid JSON file. The file is not formatted according the schema.
Evidence: should have required property 'PixelSize'
./phenotype/ASR.json
Invalid JSON file. The file is not formatted according the schema.
Evidence: should have required property 'PixelSizeUnits'
Please visit https://neurostars.org/search?q=JSON_SCHEMA_VALIDATION_ERROR for existing conversations about this issue.
That looks like it's being picked up by a microscopy rule.
It seems that something is happening internally, which causes throwing other errors when some json files are invalid. I remember reading about similar issues. Are there maybe any errors set by default and reported when something else breaks?
I don't know about that, but I've only been tangentially involved in the legacy validator. @rwblair might remember something here?
That looks like it's being picked up by a microscopy rule.
That is weird but thank you. Renaming the file resolves this issue but it's not really a solution.
We've been making a bunch of fixes in the last week or so since that release, so can you re-test with:
deno run --reload -A https://github.com/bids-standard/bids-validator/raw/master/bids-validator/src/bids-validator.ts
?
docker run -ti --rm -v $PWD:/data:ro denoland/deno deno run --allow-read --allow-env --reload -A https://github.com/bids-standard/bids-validator/raw/master/bids-validator/src/bids-validator.ts /data
error: Relative import path "@std/path" not prefixed with / or ./ or ../
at https://raw.githubusercontent.com/bids-standard/bids-validator/master/bids-validator/src/main.ts:5:25
This issue grew too much :see_no_evil: We can split it.
Damn. Okay, apparently that method was disabled by #2077. Use https://github.com/bids-standard/bids-validator/raw/deno-build/bids-validator.js.
I also get "[WARNING] The onset column in events.tsv files should be sorted. (EVENT_ONSET_ORDER)". My guess is, that this is because we have negative onsets at the beginning.
This is surprising. I've just checked the sorting function and it should handle negative numbers fine. Would you open an issue with a failing file?
We actually found this problem in our data :see_no_evil:
Damn. Okay, apparently that method was disabled by #2077. Use https://github.com/bids-standard/bids-validator/raw/deno-build/bids-validator.js.
The command I used:
docker run -ti --rm -v $PWD:/data:ro denoland/deno deno run --allow-read --allow-env --reload -A https://github.com/bids-standard/bids-validator/raw/deno-build/bids-validator.js /data
The NIFTI header warnings are not there anymore.
We are aware of some of the issues, but some don't seem right:
*events.tsv files
Agreed, it looks like the schema needs tightening up to avoid trying to apply sidecar rules for data files to events.tsv.
@effigies, thank you for all your help. Many things happened in this issue. I'm wondering how to proceed. Feel free to rename this issue to better reflect its contents.
As a summary, from my side I see the following remaining problems:
./phenotype directory.
./phenotype/ASR.json file as microscopy file and reports missing properties. This JSON file is valid. Renaming this file solves the issue.
*events.tsv files.
Please let me know if you'd like me to report anything else or open separate issues for any of these problems.
I'm running BIDS Validator v1.14.8 on a large dataset (~800GB, ~4500 files). The validator incorrectly reports the following errors:
These values are present in the JSON files.
It seems that the validator doesn't consider the JSON files. Is it possible that I'm hitting some limit? It's not memory, because the validator runs and finishes.
When I bidsignore half of the subjects, the validation passes.
I'm happy to do more investigation but I'd need to know what.