Open EdithGaspar opened 1 year ago
Hi @EdithGaspar, could you specify which value you are referring to?
I have the MRIQC results from 1000 subjects, but I don't understand how to choose the cut-off value for selecting my best images or the ones to delete.
There's no rule of thumb to do this. As we introduced in our MRIQC paper (https://doi.org/10.1371/journal.pone.0184661), you can train a classifier on a subset of your data (that you manually annotate) to then apply it on the remainder of the dataset. The original code for the classifier was moved into the nipreps/mriqc-learn repo.
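As a rough illustration of that workflow (annotate a subset, train, apply to the rest), here is a minimal sketch. It does not use mriqc-learn's own pipeline: the IQM table and the manual ratings are synthetic, the plain scikit-learn `RandomForestClassifier` is a stand-in, and the 200/800 split is an arbitrary assumption. Labels follow the convention used later in this thread (0 = exclude, 1 = accept).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for an IQM table: 1000 images x 5 metrics.
n_images, n_iqms = 1000, 5
iqms = rng.normal(size=(n_images, n_iqms))

# Pretend the first 200 images were rated by hand (0 = exclude, 1 = accept).
# The "ratings" here are synthetic: images with a high first metric are bad.
annotated = np.arange(200)
remainder = np.arange(200, n_images)
ratings = np.where(iqms[annotated, 0] > 0.5, 0, 1)

# Train on the annotated subset only.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(iqms[annotated], ratings)

# Probability of the "exclude" class for every unrated image.
exclude_col = list(clf.classes_).index(0)
p_exclude = clf.predict_proba(iqms[remainder])[:, exclude_col]

# Flag unrated images whose exclusion probability exceeds 0.5.
flagged = remainder[p_exclude > 0.5]
print(f"{flagged.size} of {remainder.size} unrated images flagged for review")
```

With real data, the synthetic arrays would be replaced by your MRIQC IQMs and your own ratings, and the flagged images are best treated as candidates for visual inspection rather than automatic deletions.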
Perhaps @jaimebarran or @t-sanchez, who have recently worked with mriqc-learn, can give you some insights or share their experience.
Hi @EdithGaspar,
You can use the baseline model from https://github.com/nipreps/mriqc-learn as follows. First, load it:
from joblib import load
# Load the trained model
model = load("/mriqc_learn/mriqc_learn/data/classifier.joblib") # check your path
Then you can use y_pred = model.predict(your_loaded_dataset),
which returns binary labels (the cutoff is 0.5). Alternatively, you can use y_scores = model.predict_proba(your_loaded_dataset)[:, 0],
which returns the probability that each image belongs to class '0' (here the negative class, i.e., excluded quality). You can then choose a threshold and get the indices of the values above or below it, for example:
threshold = 0.7
y_pred_idx = (y_scores > threshold).nonzero()[0]
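To see how the choice of threshold changes which images get flagged, here is a small pure-Python sketch. The helper name `split_by_threshold` and the example scores are made up for illustration. It assumes, as above, that the scores are probabilities of class '0' (excluded quality), so a higher score means the image is more likely to be excluded.

```python
def split_by_threshold(scores, threshold):
    """Return (flagged, kept) index lists.

    flagged: indices whose score is strictly above the threshold
    kept:    indices whose score is at or below the threshold
    """
    flagged = [i for i, s in enumerate(scores) if s > threshold]
    kept = [i for i, s in enumerate(scores) if s <= threshold]
    return flagged, kept


# Hypothetical P(class 0) values for five images.
example_scores = [0.10, 0.90, 0.70, 0.75, 0.40]

# A stricter (higher) threshold flags fewer images for exclusion.
for t in (0.5, 0.7, 0.9):
    flagged, kept = split_by_threshold(example_scores, t)
    print(f"threshold={t}: flagged={flagged}, kept={kept}")
```

For a NumPy array of scores, `(y_scores > threshold).nonzero()[0]` gives the same flagged indices. Checking a handful of images on either side of the threshold by eye is a reasonable way to decide whether the value suits your dataset.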
I would recommend retraining the model with up-to-date Python libraries (numpy, sklearn, etc.) rather than using the model from the repo directly. You can do that by following the tutorial https://github.com/nipreps/mriqc-learn/blob/main/docs/notebooks/Tutorial.ipynb, and then saving the trained model using:
from joblib import dump
dump(model, "/mriqc-learn/mriqc_learn/data/your_new_classifier.joblib")
In addition, you could train the model on your own data, as long as you have subjective ratings, by loading your prepared data with the load_dataset function.
Let me know if you need additional help!
Cheers!
@jaimebarran I was trying to see if I could run the baseline model. A couple of issues:
When I install mriqc-learn (pip install mriqc-learn), the classifier.joblib file didn't come with the install (neither did production.py). So I downloaded the raw classifier.joblib file from this repo and added it to where I thought it should be:
C:\Users\Andrew\anaconda3\Lib\site-packages\mriqc_learn\data
from joblib import load
# Load the trained model
model = load(r"C:\Users\Andrew\anaconda3\Lib\site-packages\mriqc_learn\data\classifier.joblib") # check your path
I get the following error. Any ideas?
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[5], line 3
      1 from joblib import load
      2 # Load the trained model
----> 3 model = load(r"C:\Users\Andrew\anaconda3\Lib\site-packages\mriqc_learn\data\classifier.joblib")
File c:\Users\Andrew\anaconda3\Lib\site-packages\joblib\numpy_pickle.py:658, in load(filename, mmap_mode)
    652 if isinstance(fobj, str):
    653     # if the returned file object is a string, this means we
    654     # try to load a pickle file generated with an version of
    655     # Joblib so we load it with joblib compatibility function.
    656     return load_compatibility(fobj)
--> 658 obj = _unpickle(fobj, filename, mmap_mode)
    659 return obj
File c:\Users\Andrew\anaconda3\Lib\site-packages\joblib\numpy_pickle.py:577, in _unpickle(fobj, filename, mmap_mode)
    575 obj = None
    576 try:
--> 577     obj = unpickler.load()
    578     if unpickler.compat_mode:
    579         warnings.warn("The file '%s' has been generated with a "
    580                       "joblib version less than 0.10. "
    581                       "Please regenerate this pickle file."
    582                       % filename,
...
File sklearn\tree\_tree.pyx:1418, in sklearn.tree._tree._check_node_ndarray()
ValueError: node array from the pickle has an incompatible dtype:
- expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
- got : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]
Hi @andrew-yian-sun !
"When I install mriqc-learn (pip install mriqc-learn), the classifier.joblib file didn't come with the install (neither did production.py)."
Is this how it should be? @oesteban @celprov
"when I try running your first code block to load the model... I get the following error"
I see you are using model = load(r"C:\Users\Andrew\anaconda3\Lib\site-packages\mriqc_learn\data\classifier.joblib"). The mmap_mode parameter of joblib's load controls the memory-mapping behavior of the loaded object. Memory-mapping is a method used to load data into memory more efficiently, which can be useful when working with large datasets.
Here's what each option means:
PS: I didn't install mriqc-learn; I forked the repo and modified it my own way.
Hi @jaimebarran, thanks for the tip, but it seems that either way (forking the repo, trying different options for mmap_mode) results in the same error message. I wonder if it's because the model was created with an older version of joblib and my version is too recent? My version is 1.4.0.
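One way to narrow this down is to list the versions of the libraries involved in unpickling. The final ValueError in the traceback above is raised from sklearn\tree\_tree.pyx, and the expected dtype has a missing_go_to_left field that the pickled one lacks. That suggests the scikit-learn version may matter at least as much as the joblib one. The sketch below uses only the standard library; the helper name `installed_versions` and the package list are illustrative assumptions.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Libraries that take part in loading a pickled scikit-learn model.
report = installed_versions(["joblib", "scikit-learn", "numpy"])
for name, ver in report.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Comparing this output with the environment in which the classifier was trained (if known) shows which library differs. Retraining the model in the current environment avoids the mismatch altogether.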
Hi @andrew-yian-sun,
Yes, judging from this message in your error output:
"The file '%s' has been generated with a joblib version less than 0.10. Please regenerate this pickle file." % filename
You can try to regenerate the .joblib file with your up-to-date Python libraries by running /scripts/train_model.py.
You can modify the columns (= IQMs) to drop in /models/production.py/init_pipeline()/pp.DropColumns(...). This will regenerate the classifier.joblib with your libraries. Then you can try to load it to see if it works now.
I was using joblib v1.2.0 and it worked with some warnings. I updated it to v1.4.0 and it worked without warnings.
Cheers!