yeatmanlab / pyAFQ

Automated Fiber Quantification ... in Python
http://yeatmanlab.github.io/pyAFQ/
BSD 2-Clause "Simplified" License
53 stars 34 forks source link

Adds ParallelGroupAFQ #1124

Closed teresamg closed 2 months ago

teresamg commented 3 months ago

Creates a ParallelGroupAFQ class inheriting from GroupAFQ to allow for easier parallelization through pydra. Does not currently work; in progress.

pep8speaks commented 3 months ago

Hello @teresamg! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 352:80: E501 line too long (86 > 79 characters) Line 1012:1: W293 blank line contains whitespace

Comment last updated at 2024-05-15 23:45:53 UTC
arokem commented 3 months ago

The code looks quite neat now!

@36000 : could you take a quick look when you get a chance?

Any ideas about how we can test this? I guess that we'd have to set this up as a nightly test, possibly running this parallelized with concurrent futures across two HBN subjects?

36000 commented 3 months ago

my thoughts, I will implement these:

  1. add a similar looking export function to wrap export like the export_all implemented here
  2. add pydra to setup.cfg
  3. add a nighly test on 2 hbn subjects using concurrent futures for pydra
36000 commented 2 months ago

I made some minor changes here. I made pydra a required library for pyAFQ (its pip installable with few dependencies, this will just be easier for most users). We now try to catch that error Ariel mentioned (let me know if this doesnt work). And I added an export function based on the export_all function. Next I will add the nightly test then I think we can merge this

36000 commented 2 months ago

I wrote this test:

def test_AFQ_pydra():
    _, bids_path = afd.fetch_hbn_preproc(["NDARAA948VFH", "NDARAV554TP2"])
    pga = ParallelGroupAFQ(bids_path, preproc_pipeline="qsiprep")
    pga.export_all()

and i am getting this error:

FAILED AFQ/tests/test_api.py::test_AFQ_pydra - ValueError: Split is missing values for the following fields ['pAFQ_kwargs']

Either of you two know what causes this?

arokem commented 2 months ago

Could you please push that test to this PR? I'm not sure what's up and would like to try to debug locally.

36000 commented 2 months ago

Sure, I figured out this was due to me using a different pydra version (the latest, 0.23). But now I am getting picking problems. I will push some of the changes I have made, but I think maybe we will have to talk about this in person at some point. I am running into a few issues

36000 commented 2 months ago

@teresamg mind trying this to see if it works? I factored out bidslayout

teresamg commented 2 months ago

Looks like it ran successfully, both locally and on Hyak!

arokem commented 2 months ago

Did it generate the combined tract profiles file? I also ran a test on hyak:

import os.path as op

from AFQ.api.group import ParallelGroupAFQ
from AFQ.definitions.image import RoiImage
import AFQ.api.bundle_dict as abd
import AFQ.data.fetch as afd

_, bids_path = afd.fetch_hbn_preproc(
        ["NDARZT957CWG",
         "NDARZU279XR3",
         "NDARZU401RCU",
         "NDARZU822WN3",
         "NDARZV421TCZ",
         "NDARZW262ZLV",
         "NDARZX163EWC",
         ],
        path="/gscratch/scrubbed/arokem/data/")

my_afq = ParallelGroupAFQ(
    bids_path=bids_path,
    preproc_pipeline="qsiprep",
    parallel_params={
        "submitter_params": {
            "plugin": "slurm",
            "sbatch_args": "-J test \
                            -p ckpt \
                            --nodes=1 \
                            --cpus-per-task=8 \
                            --gpus=1 \
                            -A escience \
                            --mem=64G \
                            --time=2:00:00 \
                            -o /gscratch/scrubbed/arokem/logs/test.out \
                            -e /gscratch/scrubbed/arokem/logs/test.err \
                            --mail-user=arokem@uw.edu \
                            --mail-type=ALL"
        },
        "cache_dir": "/gscratch/scrubbed/arokem/tmp"
    }
)

my_afq.export_all()

which worked great at individual subject level, but I can't find the final combined tract profiles file, which I was expecting to find.

teresamg commented 2 months ago

It did for me... does each subject have the *_desc-profiles_dwi.csv?

arokem commented 2 months ago

I don't know if this is directly related to the content of this PR, but I am now getting an error in visualizing the standard set of bundles, where the visualization code is raising:

INFO:AFQ:Generating colorful lines from tractography...
KeyError: 'Forceps Minor'

Presumably because it's looking for a tract that is now no longer part of the default set of tracts (because we're using the more granular set of CC tracts).

@36000 : any chance this is related to changes you introduced here to how data is passed between GroupAFQ and ParticipantAFQ?

If you think this is unrelated, I think we can probably merge this PR, and fix this issue elsewhere.

36000 commented 2 months ago

Not sure what would cause this, but I think it is unrelated, as removing overlapping bundle definitions happens in the init method of the BundleDict, so I don't think a race condition is causing this error.