spatialtopology / spacetop-prep

code for transferring data and preprocessing
MIT License

[BUG] Deno validator #111

Open jungheejung opened 2 months ago

jungheejung commented 2 months ago

Which module is this from?

datalad

What is the issue?

Deno validator warning and errors

What was your expected behavior?

Full pass with no errors

How can we reproduce this?

Code: https://github.com/bids-standard/bids-validator/issues/2129

Any additional context?

    [WARNING] TSV_ADDITIONAL_COLUMNS_UNDEFINED A TSV file has extra columns which are not defined in its associated JSON sidecar
        response_label
        /sub-0001/ses-03/func/sub-0001_ses-03_task-shortvideo_acq-mb8_run-01_events.tsv
        /sub-0133/ses-03/func/sub-0133_ses-03_task-shortvideo_acq-mb8_run-01_events.tsv

        97 more files with the same issue

        subtask_type
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

        199 more files with the same issue

        event_type
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

        196 more files with the same issue

        value
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractionaltomsaxe_acq-mb8_run-01_events.tsv

        101 more files with the same issue

        response_accuracy
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv

        196 more files with the same issue

        question
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv
        /sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-02_events.tsv

        47 more files with the same issue

        participant_response
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv
        /sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-02_events.tsv

        46 more files with the same issue

        normative_response
        /sub-0001/ses-04/func/sub-0001_ses-04_task-fractional_acq-mb8_run-02_events.tsv
        /sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-02_events.tsv

        46 more files with the same issue

        button_press
        /sub-0133/ses-04/func/sub-0133_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv

        98 more files with the same issue

        cue_location
        /sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0009/ses-04/func/sub-0009_ses-04_task-fractional_acq-mb8_run-01_events.tsv

        46 more files with the same issue

        target_location
        /sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0009/ses-04/func/sub-0009_ses-04_task-fractional_acq-mb8_run-01_events.tsv

        46 more files with the same issue

        trial_index
        /sub-0003/ses-04/func/sub-0003_ses-04_task-fractional_acq-mb8_run-01_events.tsv
        /sub-0009/ses-04/func/sub-0009_ses-04_task-fractional_acq-mb8_run-01_events.tsv

        46 more files with the same issue

    Please visit https://neurostars.org/search?q=TSV_ADDITIONAL_COLUMNS_UNDEFINED for existing conversations about this issue.
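
For debugging a single file without rerunning the whole validator, the check behind this warning can be approximated locally. The following is a minimal sketch, not the validator's actual logic: the function name `undefined_columns` is hypothetical, only `onset` and `duration` are assumed to be predefined, and it looks at one sidecar only (it does not follow BIDS inheritance across directory levels).

```python
import csv
import json


def undefined_columns(tsv_path, sidecar_path, predefined=("onset", "duration")):
    """Return the TSV header columns that the JSON sidecar does not describe.

    `predefined` lists columns treated as already defined by the standard;
    the default here is an assumption and may need extending.
    """
    # Only the header row of the events file is needed.
    with open(tsv_path, newline="") as f:
        header = next(csv.reader(f, delimiter="\t"))

    # Top-level keys of the sidecar are the columns it describes.
    with open(sidecar_path) as f:
        described = set(json.load(f))

    return [col for col in header if col not in described and col not in predefined]
```

Calling it on one events file and its sidecar returns the list of column names that would still need an entry in the sidecar.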

To address the issue above, I wrote a script that prints the list of runs that deviate from the standard run length (in TRs): https://github.com/spatialtopology/spacetop-prep/blob/88a560c28dd8109df4c75f61a671f9c724f038bf/spacetop_prep/datalad/identify_shorterTR.py

Some runs are indeed shorter than expected, due to partial data collection (e.g. participant had issue with trackball, scanner failure etc)

Q. What's the best way to move forward? Should we add this info to scans.tsv? @yarikoptic

sub-0001_ses-02_task-narratives_acq-mb8_run-01_bold.json has a dcmmeta_shape shorter than the standard. Value: 937
sub-0001_ses-02_task-narratives_acq-mb8_run-02_bold.json has a dcmmeta_shape shorter than the standard. Value: 1059
sub-0005_ses-04_task-fractional_acq-mb8_run-01_bold.json has a dcmmeta_shape shorter than the standard. Value: 5
sub-0013_ses-04_task-fractional_acq-mb8_run-01_bold.json has a dcmmeta_shape shorter than the standard. Value: 1234
sub-0055_ses-02_task-narratives_acq-mb8_run-04_bold.json has a dcmmeta_shape shorter than the standard. Value: 1126
sub-0069_ses-02_task-narratives_acq-mb8_run-03_bold.json has a dcmmeta_shape shorter than the standard. Value: 652
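
A check of this kind can be sketched as follows. This is not the linked `identify_shorterTR.py` script, only an illustration under stated assumptions: `dcmmeta_shape` is assumed to be a list whose last element is the number of volumes, the function name `find_short_runs` is hypothetical, and the expected volume count per task must be supplied by the caller.

```python
import json
from pathlib import Path


def find_short_runs(bids_root, expected_volumes):
    """List (filename, n_volumes) for BOLD sidecars shorter than expected.

    `expected_volumes` maps a task label to its standard volume count;
    the values are supplied by the caller and are not known to this sketch.
    """
    short = []
    for sidecar in sorted(Path(bids_root).rglob("*_bold.json")):
        meta = json.loads(sidecar.read_text())
        shape = meta.get("dcmmeta_shape")
        if not shape:
            continue  # sidecar has no shape information to compare

        # Pull the task label out of the "task-<label>" filename entity.
        task = next(
            (part[len("task-"):] for part in sidecar.name.split("_")
             if part.startswith("task-")),
            None,
        )

        # Assumption: the last dimension is the number of volumes.
        n_volumes = shape[-1]
        if task in expected_volumes and n_volumes < expected_volumes[task]:
            short.append((sidecar.name, n_volumes))
    return short
```

The returned list could then be written into a notes column of the relevant scans.tsv files, if that is the route chosen.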

jungheejung commented 2 months ago

@Zizhuang-Miao Could I get your help on resolving this HED warning? It's the 3rd issue in the output above

    [WARNING] HED_WARNING The validation on this HED string returned a warning.
        /sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv - WARNING: [UNITS_MISSING] No unit specified. Using "m" as the default - "X-position/45.62". TSV line: 11. (For more information on this HED warning, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#units-missing.)
        /sub-0001/ses-01/func/sub-0001_ses-01_task-alignvideo_acq-mb8_run-01_events.tsv - WARNING: [UNITS_MISSING] No unit specified. Using "m" as the default - "X-position/35.0". TSV line: 12. (For more information on this HED warning, see https://hed-specification.readthedocs.io/en/latest/Appendix_B.html#units-missing.)

        75994 more files with the same issue

    Please visit https://neurostars.org/search?q=HED_WARNING for existing conversations about this issue.

Currently, this is the key-value pair in the task-alignvideo_events.json file:

    "response_value": {
        "LongName": "The value of the rating",
        "Description": "This value ranges from 0 ('Barely at all') to 100 ('Strongest imaginable'). Note that if the 'duration' of one rating event was 'n/a', the response value would also be 'n/a'.",
        "HED": "(X-position/#, Agent-action, (Press, Mouse-button, Scroll-wheel))"
    }

Moving forward, it would be nice to validate the HED tags with a standalone HED validator, since running bids-validator on the entire dataset is inefficient for debugging purposes.
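
As a first step toward that, the HED strings can be pulled out of a sidecar so they can be checked on their own. The sketch below uses only the standard library and performs no HED validation itself; the function name `collect_hed_strings` is hypothetical, and it assumes each `HED` entry is either a single string or a dictionary mapping category levels to strings.

```python
import json


def collect_hed_strings(sidecar_path):
    """Return {column or column.level: HED string} for one events sidecar."""
    with open(sidecar_path) as f:
        sidecar = json.load(f)

    found = {}
    for column, meta in sidecar.items():
        # Skip entries that are not column descriptions or carry no HED key.
        if not isinstance(meta, dict) or "HED" not in meta:
            continue
        hed = meta["HED"]
        if isinstance(hed, dict):
            # Categorical column: one HED string per level.
            for level, hed_string in hed.items():
                found[f"{column}.{level}"] = hed_string
        else:
            # Value column: a single HED string, possibly with a '#' placeholder.
            found[column] = hed
    return found
```

Each extracted string can then be pasted into, or passed to, whichever HED validation tool is preferred, which keeps the debugging loop independent of a full bids-validator run.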

Zizhuang-Miao commented 2 months ago

@jungheejung I looked into this issue and I do not think this warning can be elegantly avoided. HED expects a tag that takes a value to have a specified unit following it (https://hed-specification.readthedocs.io/en/latest/03_HED_formats.html#tags-that-take-values; please also see the examples in section 3.2.2 right above it). In our experiments the rating values are either unitless (a relative number between 0 and 100) or in units of pixels, and pixel is not in the list of units allowed by HED (https://hed-specification.readthedocs.io/en/latest/Appendix_A.html#a-1-1-unit-classes-and-units). I suggest that we ignore this warning for now.

jungheejung commented 1 month ago

Awesome, appreciate you taking a look into this HED warning, @Zizhuang-Miao. In that case, we'll ignore it for now.

yarikoptic commented 1 month ago

re 6 -- you say

Some runs are indeed shorter than expected, due to partial data collection (e.g. participant had issue with trackball, scanner failure etc)

but it does not seem to be a "scanner failure", since it is the events file that is shorter than the data file, so data collection was fine. Overall, after you handle it (what about the other 60?) -- it could be added to the ignored list, I guess

jungheejung commented 1 month ago