SIMEXP / giga_auto_qc

Automatic quality control for fMRIPrep outputs, aimed at large datasets
MIT License

Running QC on a large dataset with the `participant` option #19

Closed · clarkenj closed this issue 1 year ago

clarkenj commented 1 year ago

In the error log for srpbs I get the following (some lines removed):

Traceback (most recent call last):
  File "/home/nclarke/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1256, in _execute_context
    self.dialect.do_executemany(
  File "/home/nclarke/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 605, in do_executemany
    cursor.executemany(statement, parameters)
sqlite3.OperationalError: unable to open database file

The above exception was the direct cause of the following exception:

File "/home/nclarke/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 605, in do_executemany
    cursor.executemany(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
[SQL: INSERT INTO files (path, filename, dirname, is_dir, class_) VALUES (?, ?, ?, ?, ?)]
(Background on this error at: http://sqlalche.me/e/13/e3q8)

This is the output of the `.out` file:

/home/nclarke/scratch/srpbs_fmriprep-20.2.7lts_1691842839/data
Namespace(bids_dir=PosixPath('/home/nclarke/scratch/srpbs_fmriprep-20.2.7lts_1691842839/data/fmriprep-20.2.7lts'), output_dir=PosixPath('/lustre04/scratch/nclarke/srpbs_qc'), analysis_level='participant', participant_label=None, session=None, task=None, quality_control_parameters=None, reindex_bids=False, verbose=1)
Quality control parameters: {'mean_fd': 0.55, 'scrubbing_fd': 0.2, 'proportion_kept': 0.5, 'anatomical_dice': 0.97, 'functional_dice': 0.89}

The path is correct.

The background at the link says "OperationalError. Exception raised for errors that are related to the database’s operation and not necessarily under the control of the programmer, e.g. an unexpected disconnect occurs, the data source name is not found, a transaction could not be processed, a memory allocation error occurred during processing, etc.

This error is a DBAPI Error and originates from the database driver (DBAPI), not SQLAlchemy itself.

The OperationalError is the most common (but not the only) error class used by drivers in the context of the database connection being dropped, or not being able to connect to the database. For tips on how to deal with this, see the section Dealing with Disconnects."
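The quoted class of error is easy to reproduce with the standard-library `sqlite3` module alone, which suggests checking whether the directory pybids writes its index database into actually exists and is writable:

```python
import sqlite3

# The message is raised by the sqlite driver whenever it cannot create or
# open the database file at the given path, e.g. the parent directory is
# missing or not writable, independent of free disk space.
try:
    sqlite3.connect("/this_directory_does_not_exist/layout_index.sqlite")
    err = None
except sqlite3.OperationalError as exc:
    err = exc
print(err)  # → unable to open database file
```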

I don't think it is a space issue since I have nearly 20,000 GB available. I am thinking of increasing `--mem` (currently 8G); what do you think @htwangtw, have you come across this before?

htwangtw commented 1 year ago

Since it's coming from sqlalchemy, it might be something to do with PyBIDS...

Do you have the traceback showing which line in giga_auto_qc is giving you the error? The first thing I would try is the `--reindex_bids` flag.

htwangtw commented 1 year ago

After some googling, I believe this is a pybids-related issue, and the solution is (potentially) this one: https://github.com/nipreps/fmriprep/issues/2313#issuecomment-1013680975

AFAIK it's better to:

  1. delete the pybids layout SQL database that was not built correctly
  2. build the pybids layout first on the full dataset
  3. run the QC with all the default flags.

UKBB didn't have this issue as each subject was treated as an isolated dataset.

If the proposed solution works, do you mind making a PR against the README.md to keep this as a note to users?

clarkenj commented 1 year ago

Ah! Interesting. Here is the relevant part of the traceback (I think):

Traceback (most recent call last):
  File "/home/nclarke/.local/bin/giga_auto_qc", line 8, in <module>
    sys.exit(main())
  File "/lustre03/project/6003287/nclarke/giga_preprocess2/giga_auto_qc/giga_auto_qc/run.py", line 79, in main
    workflow(args)
  File "/lustre03/project/6003287/nclarke/giga_preprocess2/giga_auto_qc/giga_auto_qc/workflow.py", line 47, in workflow
    fmriprep_bids_layout = BIDSLayout(
  File "/home/nclarke/.local/lib/python3.10/site-packages/bids/layout/layout.py", line 154, in __init__
    indexer(self)

Yes I can try that!

clarkenj commented 1 year ago

@htwangtw - Progress, but now a new error:

0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/nclarke/.local/bin/giga_auto_qc", line 8, in <module>
    sys.exit(main())
  File "/lustre03/project/6003287/nclarke/giga_preprocess2/giga_auto_qc/giga_auto_qc/run.py", line 79, in main
    workflow(args)
  File "/lustre03/project/6003287/nclarke/giga_preprocess2/giga_auto_qc/giga_auto_qc/workflow.py", line 69, in workflow
    anatomical_metrics = assessments.calculate_anat_metrics(
  File "/lustre03/project/6003287/nclarke/giga_preprocess2/giga_auto_qc/giga_auto_qc/assessments.py", line 345, in calculate_anat_metrics
    metrics["anatomical_dice"]
  File "/home/nclarke/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 3760, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/nclarke/.local/lib/python3.10/site-packages/pandas/core/indexes/range.py", line 349, in get_loc
    raise KeyError(key)
KeyError: 'anatomical_dice'

This is the output:

/home/nclarke/scratch/srpbs_fmriprep-20.2.7lts_1691842839/data
Namespace(bids_dir=PosixPath('/home/nclarke/scratch/srpbs_fmriprep-20.2.7lts_1691842839/data/fmriprep-20.2.7lts'), output_dir=PosixPath('/lustre04/scratch/nclarke/srpbs_qc'), analysis_level='participant', participant_label=None, session=None, task=None, quality_control_parameters=None, reindex_bids=False, verbose=1)
Quality control parameters: {'mean_fd': 0.55, 'scrubbing_fd': 0.2, 'proportion_kept': 0.5, 'anatomical_dice': 0.97, 'functional_dice': 0.89}
Retrieved anatomical reference mask
Use standard template as functional scan reference.
Calculate the anatomical dice score.

I tried adding some print statements to assessments.py to probe the problem, but they didn't print, which I'm confused about. Any ideas? Thank you!

htwangtw commented 1 year ago

Yay to the progress! The error is from the pandas DataFrame, so it might be worth looking into that. I will try to probe at it too.
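One hypothetical way to hit exactly that traceback: if the metrics table is built from an empty record list (e.g. no files matched the requested subjects), the columns fall back to a bare RangeIndex (note the traceback goes through `pandas/core/indexes/range.py`), and any named-column lookup raises KeyError:

```python
import pandas as pd

# Hypothetical reproduction: an empty record list produces a DataFrame
# with no named columns, so metrics["anatomical_dice"] fails the same
# way as in the traceback above.
metrics = pd.DataFrame([])
try:
    metrics["anatomical_dice"]
    caught = None
except KeyError as err:
    caught = err
print("KeyError:", caught)
```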

clarkenj commented 1 year ago

I ran the test on two subjects using this command: `pytest -m "not smoke" --doctest-modules -v --pyargs giga_auto_qc`. I'm not sure if this is completely correct, as this is quite beyond my experience. But this is the output I got, which suggests that it should complete...?

[mii] loading StdEnv/2020 pytest/7.4.0 ...
========================================= test session starts =========================================
platform linux -- Python 3.10.2, pytest-7.4.0, pluggy-1.2.0 -- /cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.10.2/bin/python
cachedir: .pytest_cache
rootdir: /lustre03/project/6003287/nclarke/giga_preprocess2/giga_auto_qc
configfile: pyproject.toml
collected 10 items / 3 deselected / 7 selected                                                        

giga_auto_qc/tests/test_assessments.py::test_quality_accessments PASSED                         [ 14%]
giga_auto_qc/tests/test_assessments.py::test_dice_coefficient PASSED                            [ 28%]
giga_auto_qc/tests/test_assessments.py::test_check_mask_affine PASSED                           [ 42%]
giga_auto_qc/tests/test_assessments.py::test_get_consistent_masks PASSED                        [ 57%]
giga_auto_qc/tests/test_cli.py::test_help PASSED                                                [ 71%]
giga_auto_qc/tests/test_utils.py::test_get_subject_lists PASSED                                 [ 85%]
giga_auto_qc/tests/test_utils.py::test_parse_scan_information PASSED                            [100%]

============================= 7 passed, 3 deselected in 354.09s (0:05:54) =============================
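For reference, the `test_dice_coefficient` test above exercises the overlap metric behind the `anatomical_dice` threshold; a generic numpy sketch of that metric (not necessarily the package's exact implementation):

```python
import numpy as np

# Generic Dice coefficient between two binary masks:
#   dice = 2 * |A ∩ B| / (|A| + |B|)
def dice_coefficient(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # two empty masks overlap trivially
    return 2.0 * np.logical_and(a, b).sum() / denom

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_coefficient(a, b), 3))  # → 0.667
```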
htwangtw commented 1 year ago

Great, let me clarify what I was trying to say: you can modify the code in the test file and run it in an interactive session, so we can find out why it is acting weird on your dataset. Your dataset might have hit some edge cases that aren't covered currently...

from bids import BIDSLayout
import templateflow
from giga_auto_qc import assessments

bids_dir = "path/to/your/data"
subject_list = ["1"]  # any subject number, remove `sub-`
fmriprep_bids_layout = BIDSLayout(
    root=bids_dir,
    database_path=bids_dir,
    validate=False,
    derivatives=True,
    reset_database=False,  # let's assume you already have a valid bids index
)
template_mask = templateflow.api.get(
    ["MNI152NLin2009cAsym"], desc="brain", suffix="mask", resolution="01"
)
df = assessments.calculate_anat_metrics(
    subject_list,
    fmriprep_bids_layout,
    {"anat": template_mask},
    {"anatomical_dice": 0.97},
)
clarkenj commented 1 year ago

I think that's what I did, except I didn't change reset_database to False... Is that what I did (see below)? Lol. Shall I re-run with reset_database=False?

def test_calculate_anat_metrics():
    bids_dir = resource_filename(
        "giga_auto_qc",
        "/home/nclarke/scratch/srpbs_fmriprep-20.2.7lts_1691842839/data/fmriprep-20.2.7lts",
    )
    fmriprep_bids_layout = BIDSLayout(
        root=bids_dir,
        database_path=bids_dir,
        validate=False,
        derivatives=True,
        reset_database=True,
    )
    template_mask = templateflow.api.get(
        ["MNI152NLin2009cAsym"], desc="brain", suffix="mask", resolution="01"
    )
    df = assessments.calculate_anat_metrics(
        ["0246", "0603"],
        fmriprep_bids_layout,
        {"anat": template_mask},
        {"anatomical_dice": 0.97},
    )
    print(df)
htwangtw commented 1 year ago

Hmmm good to know indexing is not an issue. If indexing fails, it will just not run correctly. My guess is it might be something wrong with one subject? I genuinely have no idea how that can go wrong.

I can think of a very wasteful hack here: loop through all the subjects, run the BIDS app at participant level, and pass the subject ID to --participant-label. This way you will get a one-line TSV for each subject and can figure out who is the imposter...
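That loop could be scripted roughly as follows (the paths and output layout are placeholders, and the exact flag spelling should be double-checked against `giga_auto_qc --help`):

```python
# Sketch of the per-subject loop: run the BIDS app once per participant so
# each run writes its own one-line TSV.
bids_dir = "/path/to/fmriprep-20.2.7lts"   # hypothetical
output_dir = "/path/to/qc_out"             # hypothetical
subjects = ["0246", "0603"]                # labels without the `sub-` prefix

commands = [
    f"giga_auto_qc {bids_dir} {output_dir}/sub-{sub} participant "
    f"--participant_label {sub}"
    for sub in subjects
]
for cmd in commands:
    print(cmd)
    # import shlex, subprocess
    # subprocess.run(shlex.split(cmd), check=True)  # uncomment to execute
```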

clarkenj commented 1 year ago

I'll give it a go, thank you!!

clarkenj commented 1 year ago

Plot twist: I get the same error with ds000030 and the group flag... I will see if the hack works with srpbs and then maybe try that too.

htwangtw commented 1 year ago

We have a version of ds000030 preprocessed. Let me just try it.

clarkenj commented 1 year ago

Thanks! The most recent one is the one I preprocessed