Closed mih closed 3 years ago
OK, I'm adding some information here to sketch ideas for how data quality could be represented for the studyforrest dataset. Much of this is influenced by my current and limited understanding of the dataset and its composition, i.e. it will probably change.
The represented data quality measures will depend on the data itself. There are multiple modalities and datatypes from which to extract data quality information. Most of this will come from derived (not raw) data. It is useful to look at which data derivations already exist because that would mean that we wouldn't have to run pipelines on the data (or do we specifically want to have a consolidated data quality pipeline for the full dataset?)
As a first shot it makes sense (to me) to start with standard functional and structural MRI quality metrics that most users would be familiar with. These include framewise displacement (functional) and structural-functional registration overlap (for whole brain, cortical surfaces, and ROIs). It also makes sense to first work from the derived data that are already available.
Currently, I am aware of the following derived / preprocessed data that fit the above description:
gif
outputs from a related QA pipeline (from https://github.com/psychoinformatics-de/studyforrest-data-freesurfer)
Here's a screen capture of a Plotly graph of the framewise displacement distributions per subject, each pooled over all 8 runs of the 7T audio movie fMRI acquisition.
I haven't put any time into making the graph prettier yet. Any suggestions for improvements are very welcome. The file is standalone HTML (about 4.5 MB) with embedded JavaScript, as exported from Plotly.
Notes:
sub-10 has no motion parameters in this dataset, does anyone know why? @mih? I don't think it should stop us from having this visualization in the website, though.
Do the same as above for the 3T audio-visual movie data (from here: https://github.com/psychoinformatics-de/studyforrest-data-aligned).
I noticed the motion parameters look suspicious. See an extract from studyforrest-data-aligned/sub-01/in_bold3Tp2/sub-01_task-avmovie_run-1_bold_mcparams.txt below. Typically the first three columns are translations, and the last three are rotations (either in degrees or radians). The values in the first three columns look suspiciously low. This is the same for multiple subjects.
-0.0185102 0.000964457 0.00570813 0.237992 -0.565209 1.38533
-0.016557 0.000730658 0.00541475 0.225521 -0.546841 1.33916
-0.018021 0.00118411 0.00541045 0.233148 -0.53043 1.38916
-0.0168537 0.00115001 0.00524862 0.222841 -0.584751 1.29539
-0.0180707 0.00130329 0.00560456 0.220491 -0.474867 1.34273
I looked in the code directory of this dataset for some more info, and I found this excerpt from the studyforrest-data-aligned/code/mk_movie_ds.py file:
mc = np.recfromtxt(
    'sub-%.2i/in_%s/sub-%.2i_task-%s_run-%i_bold_mcparams.txt'
    % (subj, label, subj, task, seg),
    names=('mc_xtrans', 'mc_ytrans', 'mc_ztrans', 'mc_xrot',
           'mc_yrot', 'mc_zrot'))
which seems to suggest that these parameters are in the correct order. Any comments?
Look at grabbing some of the existing freesurfer QA snapshots (from https://github.com/psychoinformatics-de/studyforrest-data-freesurfer, see an example snapshot below) and creating an informative example of freesurfer outputs. Perhaps also in montage form, or a movie of images.
Regarding the suspicious motion parameters, I'm going to assume for now that they were created according to the standard in FSL MCFLIRT (which seems to have been used here), which outputs params in the order trans_x, trans_y, trans_z, rot_x, rot_y, rot_z, with translations in mm and rotations in radians.
Actually, no. Seems like MCFLIRT puts the rotations first. This would explain the suspicious looking data.
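To make the effect of the column order concrete, here is a minimal sketch of a framewise displacement calculation that reads the rotations from the first three columns. It assumes the commonly used Power-style definition (sum of absolute volume-to-volume differences, with rotations converted to mm on a sphere) and a head radius of 50 mm; both are assumptions on my part and not necessarily what produced the plots in this thread. The function name and its arguments are made up for illustration.

```python
import numpy as np


def framewise_displacement(params, rot_first=True, radius_mm=50.0):
    """Power-style framewise displacement from six rigid-body parameters.

    params    : array-like of shape (n_volumes, 6)
    rot_first : True if the columns are rot_x, rot_y, rot_z, trans_x,
                trans_y, trans_z (rotations in radians, translations in mm)
    radius_mm : assumed head radius used to convert rotations to mm
    """
    params = np.asarray(params, dtype=float)
    if params.ndim != 2 or params.shape[1] != 6:
        raise ValueError("expected an array of shape (n_volumes, 6)")
    if rot_first:
        rot, trans = params[:, :3], params[:, 3:]
    else:
        trans, rot = params[:, :3], params[:, 3:]
    # Absolute change between consecutive volumes; rotations become arc length.
    d_trans = np.abs(np.diff(trans, axis=0))
    d_rot = np.abs(np.diff(rot, axis=0)) * radius_mm
    fd = d_trans.sum(axis=1) + d_rot.sum(axis=1)
    # The first volume has no predecessor, so its displacement is set to 0.
    return np.concatenate([[0.0], fd])


# The five rows quoted above from sub-01, run 1 of the 3T data.
extract = np.array([
    [-0.0185102, 0.000964457, 0.00570813, 0.237992, -0.565209, 1.38533],
    [-0.016557, 0.000730658, 0.00541475, 0.225521, -0.546841, 1.33916],
    [-0.018021, 0.00118411, 0.00541045, 0.233148, -0.53043, 1.38916],
    [-0.0168537, 0.00115001, 0.00524862, 0.222841, -0.584751, 1.29539],
    [-0.0180707, 0.00130329, 0.00560456, 0.220491, -0.474867, 1.34273],
])

print("rotations first:   ", framewise_displacement(extract))
print("translations first:", framewise_displacement(extract, rot_first=False))
```

Reading the last three columns as radians instead would scale differences of several hundredths by the 50 mm radius, so comparing the two printed rows is a quick way to see how much the assumed order matters.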
Here's a screencast for the 3T framewise displacement distributions. In this case data for sub-07, sub-08, sub-11, sub-12 and sub-13 were all missing.
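For reference, a small sketch of how the missing motion-parameter files could be listed. The path pattern is the one from the mk_movie_ds.py excerpt above; the function name, its arguments, and the subject and run ranges in the usage lines are placeholders of mine, not something taken from the dataset.

```python
from pathlib import Path


def missing_mcparams(root, subjects, label, task, runs):
    """Return {subject: [runs without a motion-parameter file]}.

    Subjects with all files present do not appear in the result.
    """
    missing = {}
    for subj in subjects:
        for run in runs:
            path = (
                Path(root)
                / f"sub-{subj:02d}"
                / f"in_{label}"
                / f"sub-{subj:02d}_task-{task}_run-{run}_bold_mcparams.txt"
            )
            if not path.is_file():
                missing.setdefault(subj, []).append(run)
    return missing


if __name__ == "__main__":
    # Hypothetical call: the subject and run ranges are guesses, adjust as needed.
    report = missing_mcparams(
        "studyforrest-data-aligned", range(1, 21), "bold3Tp2", "avmovie", range(1, 9)
    )
    for subj, runs in sorted(report.items()):
        print(f"sub-{subj:02d}: missing runs {runs}")
```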
Looks great, all of it!
re the mysterious motion params: I think your last hypothesis is the right one. The value range fits radians, and the rotations come first.
- Since we're looking at high-level summary visualizations, I pooled all framewise displacement measures from all 8 runs into one distribution plot per subject. Any interest in seeing it per run?
I have seen multiple analyses that selected individual "good" runs, so it would make sense.
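A rough sketch of how per-run summaries could sit next to the pooled per-subject one, so that individual "good" runs can be picked out. The 0.5 mm threshold, the function name, and the toy values are my assumptions for illustration only; they are not values from the dataset.

```python
import numpy as np


def summarize_fd(fd_by_run, threshold_mm=0.5):
    """Summarize framewise displacement per run and pooled over all runs.

    fd_by_run    : dict mapping a run label to a 1-D sequence of FD values (mm)
    threshold_mm : assumed cutoff for counting high-motion volumes
    """
    def stats(values):
        values = np.asarray(values, dtype=float)
        return {
            "n_volumes": int(values.size),
            "mean_fd": float(values.mean()),
            "max_fd": float(values.max()),
            "frac_above": float((values > threshold_mm).mean()),
        }

    per_run = {run: stats(fd) for run, fd in fd_by_run.items()}
    pooled = stats(
        np.concatenate([np.asarray(fd, dtype=float) for fd in fd_by_run.values()])
    )
    return per_run, pooled


# Made-up toy values, only to show the shape of the output.
toy_runs = {1: [0.1, 0.3], 2: [0.2, 1.0]}
per_run, pooled = summarize_fd(toy_runs)
for run, summary in per_run.items():
    print(f"run {run}: {summary}")
print(f"pooled: {pooled}")
```

The same per-run dictionary could feed one distribution trace per run in the Plotly figure, with the pooled summary kept as the default view.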
sub-10 has no motion parameters in this dataset, does anyone know why? @mih? I don't think it should stop us from having this visualization in the website, though.
https://www.nature.com/articles/sdata20143/tables/4 Distortion correction did not work.
A 'good' subject, all runs:
And a single 'good' run from the same subject:
Have created this gif from the freesurfer-processed files for subjects 1 through 20. It shows, per subject, snapshots of the white/grey matter, subcortical atlas, and cortical parcellations. Would this be useful to include in the website?
Data quality update done to the explore page with #44 and #46
Maybe it would be straightforward to adopt @jsheunis web-app for this?