bids-standard / pybids

Python tools for querying and manipulating BIDS datasets.
https://bids-standard.github.io/pybids/
MIT License
Missing entities in get_contrasts() #317

Closed: effigies closed this issue 5 years ago

effigies commented 5 years ago

For model:

{
  "Name": "ds000030_bart",
  "Description": "model for balloon analog risk task",
  "Input": {
    "task": "bart"
  },
  "Steps": [
    {
      "Level": "run",
      "Transformations": [
        {
          "Name": "Factor",
          "Input": ["trial_type", "action"]
        },
        {
          "Name": "And",
          "Input": ["trial_type.BALOON", "action.ACCEPT"],
          "Output": ["accept"]
        },
        {
          "Name": "And",
          "Input": ["trial_type.BALOON", "action.EXPLODE"],
          "Output": ["explode"]
        }
      ],
      "Model": {
        "X": [
          "accept",
          "explode",
          "framewise_displacement",
          "trans_x", "trans_y", "trans_z", "rot_x", "rot_y", "rot_z"
        ]
      },
      "Contrasts": [
        {
          "Name": "accept_vs_explode",
          "ConditionList": ["accept", "explode"],
          "Weights": [1, -1],
          "Type": "t"
        }
      ]
    },
    {
      "Level": "Dataset",
      "AutoContrasts": ["accept_vs_explode"]
    }
  ]
}

The following:

from bids import BIDSLayout, Analysis

# Paths to the raw dataset, the fMRIPrep derivatives, and the model above
bids_dir = '/data/bids/openfmri/ds000030'
preproc_dir = '/data/out/ds000030/derivatives/fmriprep/'
model_file = '/data/bids/openfmri/ds000030/model-example_smdl.json'
participants = ['10159', '10171', '10206']

layout = BIDSLayout(bids_dir, derivatives=preproc_dir, validate=False)
analysis = Analysis(layout=layout, model=model_file)
# steps[1] is the dataset-level step of the model
analysis.steps[1].get_contrasts()

Results in:

[[ContrastInfo(name='accept_vs_explode', weights=   accept_vs_explode
  0                  1, type='t', entities={})]]

effigies commented 5 years ago

The entities should be {'task': 'bart'}.

tyarkoni commented 5 years ago

Thanks, investigating now...

effigies commented 5 years ago

Same thing is happening at the dataset level for:

{
  "Name": "ds000117_face",
  "Description": "Example three-level model",
  "Input": {
    "task": "facerecognition"
  },
  "Steps": [
    { 
      "Level": "run",
      "Transformations": [
        { 
          "Name": "Factor",
          "Input": ["stim_type"]
        },
        { 
          "Name": "Or",
          "Input": ["stim_type.FAMOUS", "stim_type.UNFAMILIAR"],
          "Output": ["real_faces"]
        }
      ],
      "Model": {
        "X": [
          "real_faces",
          "stim_type.SCRAMBLED",
          "framewise_displacement",
          "trans_x", "trans_y", "trans_z", "rot_x", "rot_y", "rot_z"
        ]
      },
      "Contrasts": [
        {
          "Name": "face_vs_scram",
          "ConditionList": ["real_faces", "stim_type.SCRAMBLED"],
          "Weights": [1, -1],
          "Type": "t"
        }
      ]
    },
    {
      "Level": "Subject",
      "AutoContrasts": ["face_vs_scram"]
    },
    {
      "Level": "Dataset",
      "AutoContrasts": ["face_vs_scram"]
    }
  ]
}

tyarkoni commented 5 years ago

Yeah, I just got to the point where I can replicate the problem (had to download ds000030 and remove all the derivative references). Working on a diagnosis now.

tyarkoni commented 5 years ago

Ah, okay. I'm pretty sure the problem is that currently, the .entities property for a BIDSVariableCollection only includes entities that are constant across all rows and variables (since otherwise they're not really entities of the collection, but of its constituents). But when the covariates are automatically read in from participants.tsv, they get assigned a NaN value for the task entity, which results in task being stripped from the entities dictionary.

Working on a solution now; I need to decide whether to ignore NaN values when making the determination, or come up with some other scheme. Let me know if you have thoughts.
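
To make the diagnosis above concrete, here is a minimal sketch in plain Python. It is not pybids code: the row layout, the `common_entities` helper, and the `ignore_missing` flag are all hypothetical, chosen only to show how a "constant across all rows" rule drops task once participants.tsv rows contribute NaN, and how ignoring NaN restores it.

```python
import math

NAN = float("nan")

# Hypothetical merged rows: event-derived rows carry task='bart', while rows
# read from participants.tsv have no task and are filled with NaN.
rows = [
    {"subject": "10159", "task": "bart"},
    {"subject": "10159", "task": NAN},
    {"subject": "10171", "task": "bart"},
    {"subject": "10171", "task": NAN},
]


def is_missing(value):
    """True for float NaN, False for every other value."""
    return isinstance(value, float) and math.isnan(value)


def common_entities(rows, ignore_missing):
    """Return {entity: value} for entities with exactly one distinct value.

    Hypothetical helper. With ignore_missing=False a NaN counts as its own
    distinct value; with ignore_missing=True NaN entries are skipped.
    """
    common = {}
    keys = sorted({key for row in rows for key in row})
    for key in keys:
        present = [row[key] for row in rows
                   if key in row and not is_missing(row[key])]
        has_missing = len(present) < len(rows)
        distinct = set(present)
        if len(distinct) != 1:
            continue  # varies across rows (or is never set): not shared
        if has_missing and not ignore_missing:
            continue  # NaN treated as a second, conflicting value
        common[key] = present[0]
    return common


strict = common_entities(rows, ignore_missing=False)
lenient = common_entities(rows, ignore_missing=True)
print(strict)   # task is stripped because of the NaN rows
print(lenient)  # task survives once NaN is ignored
```

Under these assumptions the strict rule returns an empty dictionary, matching the entities={} seen in the report, while the lenient rule returns the expected {'task': 'bart'}. subject is excluded in both cases because it genuinely varies across rows.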

effigies commented 5 years ago

And this only happens at the top level? Because the task entity is present in the lower levels.

tyarkoni commented 5 years ago

I'm guessing that's because at the lower level no confounds are read in that can't be clearly tied to the same task, so there aren't NaN values, hence all task values are the same.

tyarkoni commented 5 years ago

Ah, okay, looks like an easy fix. I was already actually trying to ignore NaN values; I just wasn't doing it properly. PR coming shortly, hopefully.
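
The thread does not say what the faulty NaN handling actually looked like, so the following is only an illustration of one common way such a check goes wrong in Python, not a reconstruction of the pybids bug. NaN compares unequal to everything, including itself, so filtering with `!=` removes nothing; an explicit `math.isnan` test is needed. The variable and function names here are hypothetical.

```python
import math

nan = float("nan")
task_values = ["bart", nan, "bart", nan]

# Broken attempt: `v != nan` is True for every element (even for NaN itself),
# so no value is filtered out.
broken = [v for v in task_values if v != nan]


def is_nan(value):
    """Explicit NaN test that is safe to call on non-float values."""
    return isinstance(value, float) and math.isnan(value)


# Working filter: drop NaN entries before checking for a single shared value.
fixed = [v for v in task_values if not is_nan(v)]

print(len(broken))  # still contains the NaN entries
print(fixed)        # only the real task labels remain
```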

tyarkoni commented 5 years ago

Feel free to merge as soon as tests pass.