CoBrALab / MAGeTbrain

Multiple Automatically Generated Templates brain segmentation algorithm

ValueError: could not convert string to float: cant read file #59

Closed pohankung closed 1 year ago

pohankung commented 1 year ago

Dear Dr. Devenyi,

Hope you are well.

Our team is trialling segmentation of the habenula using MAGeT and has encountered the below error when running mb run -q parallel:

    +++ PrintHeader /path/to/segmentation/project/folder/input/templates/brains/brain104_t1.mnc 1
    /usr/local/easybuild-2019/easybuild/build/ANTs/2.4.3/foss-2020b/easybuild_obj/ITKv5/Modules/ThirdParty/MINC/src/libminc/libsrc2/volume.c:1403 (from MINC): Unable to open file '/path/to/segmentation/project/folder/input/templates/brains/brain104_t1.mnc'
    ++ python -c 'print(min([abs(x) for x in [float(x) for x in " cant read /path/to/segmentation/project/folder/input/templates/brains/brain104_t1.mnc".split("x")]]))'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "<string>", line 1, in <listcomp>
    ValueError: could not convert string to float: ' cant read /path/to/segmentation/project/folder/input/templates/brains/brain104_3D_e1.mnc'
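For context on where the ValueError comes from: the one-liner in the trace expects PrintHeader <image> 1 to print a spacing string like 0.75x0.75x0.749999, splits it on "x", and converts each piece to a float. Below is a minimal reconstruction of that logic (not MAGeT's actual code; the helper name is mine), showing that any error text from PrintHeader reaching the parser raises exactly this exception:

```python
def min_spacing(printheader_output):
    """Mimic the one-liner from the trace: split PrintHeader's
    spacing string on 'x' and take the smallest absolute value."""
    return min(abs(float(x)) for x in printheader_output.split("x"))

# Normal case: PrintHeader printed the voxel spacing
print(min_spacing("0.75x0.75x0.749999"))  # -> 0.749999

# Failure case: PrintHeader printed an error message instead,
# so float() raises ValueError, as in the traceback above
try:
    min_spacing(" cant read brain104_t1.mnc")
except ValueError as e:
    print(e)
```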

Following your suggestions on a similar error on another thread CoBrALab/MAGeTDocker#7, we've investigated the possibility of corrupted files and tested the PrintHeader and mincinfo output. For context, our files are originally in Nifti-1 format, skull-stripped and bias-corrected using SPM12 before converting to MINC using nii2mnc (minc-toolkit/minc-toolkit/1.9.18.2). Command line history quoted below:

    (MAGeT) [tkung@spartan-bm084 brains]$ mincinfo brain104_t1.mnc
    file: brain104_t1.mnc
    image: signed__ short -32768 to 32767
    image dimensions: zspace yspace xspace
        dimension name         length         step        start
        --------------         ------         ----        -----
        zspace                    224     0.749999     -75.4294
        yspace                    320         0.75      -104.36
        xspace                    320         0.75     -115.991

    (MAGeT) [tkung@spartan-bm084 brains]$ PrintHeader brain104_t1.mnc
    /usr/local/easybuild-2019/easybuild/build/ANTs/2.4.3/foss-2020b/easybuild_obj/ITKv5/Modules/ThirdParty/MINC/src/libminc/libsrc2/volume.c:1403 (from MINC): Unable to open file 'brain104_t1.mnc'
    cant read brain104_t1.mnc

I've tested the two commands with the T1 images supplied on your atlas page and had no issues. See below:

    (MAGeT) [tkung@spartan-bm084 brains]$ pwd
    /path/to/segmentation/project/folder/input/atlases/brains
    (MAGeT) [tkung@spartan-bm084 brains]$ mincinfo brain1_t1.mnc
    file: brain1_t1.mnc
    image: signed__ short -32768 to 32767
    image dimensions: xspace zspace yspace
        dimension name         length         step        start
        --------------         ------         ----        -----
        xspace                    489          0.3     -75.4061
        zspace                    503          0.3     -20.8617
        yspace                    734          0.3     -80.3497

    (MAGeT) [tkung@spartan-bm084 brains]$ PrintHeader brain1_t1.mnc
     Spacing [0.3, 0.3, 0.3]
     Origin [-75.4061, -80.3497, -20.8617]
     Direction 1 0 0 0 1 0 0 0 1
     Size : 489 734 503
     Image Dimensions   : [489, 734, 503]
     Bounding Box       : {[-75.4061 -80.3497 -20.8617], [71.2939 139.85 130.038]}
     Voxel Spacing      : [0.3, 0.3, 0.3]
     Intensity Range    : [-21.7507, 9418.37]
     Mean Intensity     : 1131.89
     Direction Cos Mtx. :
       1 0 0
       0 1 0
       0 0 1
     Voxel->RAS x-form :
     Image Metadata:
       ITK_original_direction of unsupported type N3itk6MatrixIdLj3ELj3EEE
       ITK_original_spacing of unsupported type St6vectorIdSaIdEE
       dicom_0x0010:el_0x0010 = anonymous
       dimension_order = +X+Z+Y
       patient:full_name = anonymous
       patient:varid = MINC standard variable
       patient:vartype = group____
       patient:version = MINC Version 1.0
       storage_data_type = s

Given your expertise on the inner workings of PrintHeader and how it is used in MAGeT, I wonder if you have any suggestions on the cause of these issues and how I can resolve them.

Thank you in advance for your guidance. Please let me know if more information is required.

Many thanks, Terry

gdevenyi commented 1 year ago

Hi,

My suspicion here is that you have MINC1 files, which ITK-based tools cannot read.

You can confirm this with file brain104_t1.mnc, which will say "NetCDF" for MINC1, and "HDF5" for MINC2.

If you have MINC1, use mincconvert -2 input.mnc output.mnc to fix.
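If there are many files to check, the same MINC1-vs-MINC2 test can be scripted by reading the magic bytes directly; this is a sketch assuming the standard container signatures (MINC1 is NetCDF classic, which begins with "CDF"; MINC2 is HDF5, which begins with the 8-byte HDF5 signature), and the helper name is illustrative:

```python
# Sketch: classify a MINC file by its container's magic bytes,
# mirroring what `file` reports. Helper name is illustrative.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"   # HDF5 signature  -> MINC2
NETCDF_MAGIC = b"CDF"               # NetCDF classic  -> MINC1

def minc_version(path):
    with open(path, "rb") as f:
        header = f.read(8)
    if header.startswith(HDF5_MAGIC):
        return "MINC2"
    if header.startswith(NETCDF_MAGIC):
        return "MINC1"
    return "unknown"
```

Any file reported as MINC1 would then go through mincconvert -2 as above.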

pohankung commented 1 year ago

Dear Dr. Devenyi,

Thank you very much for the pointer. After converting the files to MINC2, the PrintHeader function is performing as expected and I was able to run the mb run -q parallel command.

With that said, I encountered the following output that looked alarming to me, in particular the NOMASK messages and that the mb_register_labelmask_antsReg_allANTs function was killed. I was wondering if you have any suggestions on how to resolve this. Thank you in advance and please let me know if you would like me to open a separate thread for this follow-up issue.

    (MAGeT) [tkung@spartan-bm083 Hb_segmentation]$ mb run -q parallel
    5 atlases, 21 templates, 49 subjects found
    /data/scratch/projects/punim1471/MAGeTbrain/bin/mb:441: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
      logger.warn(message)
    'vanilla' mode only works for the 'qbatch' queue. Other queues treat this as stage as 'register'. You will need to run the 'voting' stage separately.
    Academic tradition requires you to cite works you base your article on. When using programs that use GNU Parallel to process data for publication please cite:

O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; AND IT WON'T COST YOU A CENT. If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

To silence the citation notice: run 'parallel --bibtex'.

    Reading Volume: ...............................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Outputting Volume: ............................................................
    Reading Volume: ...............................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Dilation: .....................................................................
    Outputting Volume: ............................................................
    All_Command_lines_OK
    Using double precision for computations.

The composite transform comprises the following transforms (in order):

  1. Center of mass alignment using fixed image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc and moving image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc (type = Euler3DTransform)

    Reading mask(s).
    Registration stage 0
      No fixed mask
      No moving mask
    Registration stage 1
      No fixed mask
      No moving mask
    Registration stage 2
      No fixed mask
      No moving mask
    Registration stage 3
      No fixed mask
      Moving mask = /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc
    Registration stage 4
      No fixed mask
      Moving mask = /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc
    Registration stage 5
      No fixed mask
      Moving mask = /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask_fine.mnc
    number of levels = 12
    number of levels = 11
    number of levels = 10
    number of levels = 10
    number of levels = 6
    number of levels = 6
    fixed image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc
    moving image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc
    fixed image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc
    moving image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc
    fixed image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc
    moving image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc
    fixed image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc
    moving image: /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc
    ++ mktemp -d

    • tmpdir=/tmp/tmp.gfaeda0op2
    • atlas=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc
    • target=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc
    • output_xfm=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/nl.xfm ++ basename /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc .mnc
    • atlas_stem=brain1_t1 ++ basename /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc .mnc
    • target_stem=brain104_t1 +++ dirname /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc ++ dirname /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains
    • atlas_labels=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/labels/brain1_t1_labels.mnc ++ dirname /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/nl.xfm
    • output_dir=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1
    • AT_lin_xfm=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/ATlin.xfm
    • TA_lin_xfm=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/ATlin_inverse.xfm
    • TA_nl_xfm=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/TAnl
    • AT_nl_xfm=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/TAnl1_inverse_NL.xfm
    • atlas_label_mask=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc
    • atlas_label_mask_close=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask_fine.mnc
    • atlas_res_label_mask=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmasklinres.mnc +++ PrintHeader /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc 1 ++ python -c 'print(min([abs(x) for x in [float(x) for x in "0.75x0.75x0.749999".split("x")]]))'
    • fixed_minimum_resolution=0.749999 +++ PrintHeader /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc 1 +++ PrintHeader /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc 2 ++ python -c 'print(max([ a*b for a,b in zip([abs(x) for x in [float(x) for x in "0.75x0.75x0.749999".split("x")]],[abs(x) for x in [float(x) for x in "320x320x224".split("x")]])]))'
    • fixed_maximum_resolution=240.0
    • [[ -s /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/labels/brain1_t1_labels.mnc ]]
    • [[ ! -s /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc ]] +++ PrintHeader /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/labels/brain1_t1_labels.mnc 1 ++ python -c 'print(min([abs(x) for x in [float(x) for x in "0.3x0.3x0.3".split("x")]]))'
    • moving_minimum_resolution=0.3 ++ awk -vORS= '{print "D"}' +++ calc 'int(3/0.3+0.5)' +++ awk 'BEGIN { print int(3/0.3+0.5) }' ++ seq 10
    • mincmorph -3D26 -clobber -successive 'B[0.5:inf:1:0]DDDDDDDDDD' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/labels/brain1_t1_labels.mnc /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc ++ awk -vORS= '{print "D"}' +++ calc 'int(1.5/0.3+0.5)' +++ awk 'BEGIN { print int(1.5/0.3+0.5) }' ++ seq 5
    • mincmorph -3D26 -clobber -successive 'B[0.5:inf:1:0]DDDDD' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/labels/brain1_t1_labels.mnc /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask_fine.mnc
    • fixedmask=NOMASK
    • movingmask=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc
    • fixedmaskfine=NOMASK
    • movingmaskfine=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask_fine.mnc
    • [[ ! -s /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/ATlin.xfm ]]
    • [[ ! -s /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/ATlin_inverse.xfm ]]
    • fixedfile=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc
    • movingfile=/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc ++ mb_ants_generate_iterations.py --min 0.749999 --max 240.0 --output multilevel-halving --convergence 1e-7
    • affinesteps='--transform Translation[ 0.5 ] \ --metric Mattes[ ${fixedfile},${movingfile},1,32,None ] \ --convergence [ 500x500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ] \ --shrink-factors 10x10x10x10x10x10x10x10x10x10x10x10 \ --smoothing-sigmas 7.3184636752729x6.999653238640289x6.680812735381885x6.361937644962326x6.043022492151717x5.724060580744148x5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888mm \ --masks [ NOMASK,NOMASK ] \ --transform Rigid[ 0.33437015 ] \ --metric Mattes[ ${fixedfile},${movingfile},1,32,None ] \ --convergence [ 500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ] \ --shrink-factors 10x10x10x10x10x10x10x10x9x8x7 \ --smoothing-sigmas 5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888x3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054mm \ --masks [ NOMASK,NOMASK ] \ --transform Similarity[ 0.2236068 ] \ --metric Mattes[ ${fixedfile},${movingfile},1,32,None ] \ --metric GC[ ${fixedfile},${movingfile},1,NA,None ] \ --convergence [ 500x500x500x500x500x500x500x500x450x150,1e-7,10 ] \ --shrink-factors 10x10x9x8x7x6x5x4x3x2 \ --smoothing-sigmas 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm \ --masks [ NOMASK,NOMASK ] \ --transform Similarity[ 0.14953488 ] \ --metric Mattes[ ${fixedfile},${movingfile},1,32,None ] \ --metric GC[ ${fixedfile},${movingfile},1,NA,None ] \ --convergence [ 500x500x500x500x500x500x500x500x450x150,1e-7,10 ] \ --shrink-factors 10x10x9x8x7x6x5x4x3x2 \ --smoothing-sigmas 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm \ --masks [ ${fixedmask},${movingmask} ] \ --transform Affine[ 
0.1 ] \ --metric Mattes[ ${fixedfile},${movingfile},1,64,None ] \ --metric GC[ ${fixedfile},${movingfile},1,NA,None ] \ --convergence [ 500x500x500x450x150x50,1e-7,10 ] \ --shrink-factors 6x5x4x3x2x1 \ --smoothing-sigmas 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm \ --masks [ ${fixedmask},${movingmask} ] \ --transform Affine[ 0.1 ] \ --metric Mattes[ ${fixedfile},${movingfile},1,64,None ] \ --metric GC[ ${fixedfile},${movingfile},1,NA,None ] \ --convergence [ 500x500x500x450x150x50,1e-7,10 ] \ --shrink-factors 6x5x4x3x2x1 \ --smoothing-sigmas 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm \ --masks [ ${fixedmaskfine},${movingmaskfine} ] ' ++ eval echo --transform 'Translation[' 0.5 ']' '\' --metric 'Mattes[' '${fixedfile},${movingfile},1,32,None' ']' '\' --convergence '[' 500x500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ']' '\' --shrink-factors 10x10x10x10x10x10x10x10x10x10x10x10 '\' --smoothing-sigmas 7.3184636752729x6.999653238640289x6.680812735381885x6.361937644962326x6.043022492151717x5.724060580744148x5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888mm '\' --masks '[' NOMASK,NOMASK ']' '\' --transform 'Rigid[' 0.33437015 ']' '\' --metric 'Mattes[' '${fixedfile},${movingfile},1,32,None' ']' '\' --convergence '[' 500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ']' '\' --shrink-factors 10x10x10x10x10x10x10x10x9x8x7 '\' --smoothing-sigmas 5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888x3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054mm '\' --masks '[' NOMASK,NOMASK ']' '\' --transform 'Similarity[' 0.2236068 ']' '\' --metric 'Mattes[' '${fixedfile},${movingfile},1,32,None' ']' '\' --metric 'GC[' '${fixedfile},${movingfile},1,NA,None' ']' '\' --convergence '[' 
500x500x500x500x500x500x500x500x450x150,1e-7,10 ']' '\' --shrink-factors 10x10x9x8x7x6x5x4x3x2 '\' --smoothing-sigmas 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm '\' --masks '[' NOMASK,NOMASK ']' '\' --transform 'Similarity[' 0.14953488 ']' '\' --metric 'Mattes[' '${fixedfile},${movingfile},1,32,None' ']' '\' --metric 'GC[' '${fixedfile},${movingfile},1,NA,None' ']' '\' --convergence '[' 500x500x500x500x500x500x500x500x450x150,1e-7,10 ']' '\' --shrink-factors 10x10x9x8x7x6x5x4x3x2 '\' --smoothing-sigmas 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm '\' --masks '[' '${fixedmask},${movingmask}' ']' '\' --transform 'Affine[' 0.1 ']' '\' --metric 'Mattes[' '${fixedfile},${movingfile},1,64,None' ']' '\' --metric 'GC[' '${fixedfile},${movingfile},1,NA,None' ']' '\' --convergence '[' 500x500x500x450x150x50,1e-7,10 ']' '\' --shrink-factors 6x5x4x3x2x1 '\' --smoothing-sigmas 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm '\' --masks '[' '${fixedmask},${movingmask}' ']' '\' --transform 'Affine[' 0.1 ']' '\' --metric 'Mattes[' '${fixedfile},${movingfile},1,64,None' ']' '\' --metric 'GC[' '${fixedfile},${movingfile},1,NA,None' ']' '\' --convergence '[' 500x500x500x450x150x50,1e-7,10 ']' '\' --shrink-factors 6x5x4x3x2x1 '\' --smoothing-sigmas 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm '\' --masks '[' '${fixedmaskfine},${movingmaskfine}' ']' +++ echo --transform 'Translation[' 0.5 ']' ' --metric' 'Mattes[' 
/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' ' --convergence' '[' 500x500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ']' ' --shrink-factors' 10x10x10x10x10x10x10x10x10x10x10x10 ' --smoothing-sigmas' 7.3184636752729x6.999653238640289x6.680812735381885x6.361937644962326x6.043022492151717x5.724060580744148x5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888mm ' --masks' '[' NOMASK,NOMASK ']' ' --transform' 'Rigid[' 0.33437015 ']' ' --metric' 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' ' --convergence' '[' 500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ']' ' --shrink-factors' 10x10x10x10x10x10x10x10x9x8x7 ' --smoothing-sigmas' 5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888x3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054mm ' --masks' '[' NOMASK,NOMASK ']' ' --transform' 'Similarity[' 0.2236068 ']' ' --metric' 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' ' --metric' 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' ' --convergence' '[' 500x500x500x500x500x500x500x500x450x150,1e-7,10 ']' ' --shrink-factors' 10x10x9x8x7x6x5x4x3x2 ' --smoothing-sigmas' 
3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm ' --masks' '[' NOMASK,NOMASK ']' ' --transform' 'Similarity[' 0.14953488 ']' ' --metric' 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' ' --metric' 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' ' --convergence' '[' 500x500x500x500x500x500x500x500x450x150,1e-7,10 ']' ' --shrink-factors' 10x10x9x8x7x6x5x4x3x2 ' --smoothing-sigmas' 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm ' --masks' '[' NOMASK,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc ']' ' --transform' 'Affine[' 0.1 ']' ' --metric' 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,64,None ']' ' --metric' 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' ' --convergence' '[' 500x500x500x450x150x50,1e-7,10 ']' ' --shrink-factors' 6x5x4x3x2x1 ' --smoothing-sigmas' 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm ' --masks' '[' 
NOMASK,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc ']' ' --transform' 'Affine[' 0.1 ']' ' --metric' 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,64,None ']' ' --metric' 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' ' --convergence' '[' 500x500x500x450x150x50,1e-7,10 ']' ' --shrink-factors' 6x5x4x3x2x1 ' --smoothing-sigmas' 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm ' --masks' '[' NOMASK,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask_fine.mnc ']'
    • antsRegistration --dimensionality 3 --verbose --minc --output '[' /tmp/tmp.gfaeda0op2/reg ']' --use-histogram-matching 0 --initial-moving-transform '[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1 ']' --transform 'Translation[' 0.5 ']' --metric 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' --convergence '[' 500x500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ']' --shrink-factors 10x10x10x10x10x10x10x10x10x10x10x10 --smoothing-sigmas 7.3184636752729x6.999653238640289x6.680812735381885x6.361937644962326x6.043022492151717x5.724060580744148x5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888mm --masks '[' NOMASK,NOMASK ']' --transform 'Rigid[' 0.33437015 ']' --metric 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' --convergence '[' 500x500x500x500x500x500x500x500x500x500x500,1e-7,10 ']' --shrink-factors 10x10x10x10x10x10x10x10x9x8x7 --smoothing-sigmas 5.4050436328045715x5.085961291843281x4.766800425951337x4.44754413004371x4.128170263644341x3.808649250285888x3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054mm --masks '[' NOMASK,NOMASK ']' --transform 'Similarity[' 0.2236068 ']' --metric 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' --metric 'GC[' 
/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' --convergence '[' 500x500x500x500x500x500x500x500x450x150,1e-7,10 ']' --shrink-factors 10x10x9x8x7x6x5x4x3x2 --smoothing-sigmas 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm --masks '[' NOMASK,NOMASK ']' --transform 'Similarity[' 0.14953488 ']' --metric 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,32,None ']' --metric 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' --convergence '[' 500x500x500x500x500x500x500x500x450x150,1e-7,10 ']' --shrink-factors 10x10x9x8x7x6x5x4x3x2 --smoothing-sigmas 3.488940662562757x3.1689877297299804x2.848708122042206x2.5279776793148354x2.206599822975054x1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635mm --masks '[' NOMASK,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc ']' --transform 'Affine[' 0.1 ']' --metric 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,64,None ']' --metric 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' --convergence 
'[' 500x500x500x450x150x50,1e-7,10 ']' --shrink-factors 6x5x4x3x2x1 --smoothing-sigmas 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm --masks '[' NOMASK,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask.mnc ']' --transform 'Affine[' 0.1 ']' --metric 'Mattes[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,64,None ']' --metric 'GC[' /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/templates/brains/brain104_t1.mnc,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/input/atlases/brains/brain1_t1.mnc,1,NA,None ']' --convergence '[' 500x500x500x450x150x50,1e-7,10 ']' --shrink-factors 6x5x4x3x2x1 --smoothing-sigmas 1.8842433121833788x1.5603016981906959x1.2335268008278057x0.9008406054674286x0.5516499557437635x0.0mm --masks '[' NOMASK,/data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation/output/registrations/brain1_t1/brain104_t1/labelmask_fine.mnc ']' file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . file NOMASK does not exist . /data/scratch/projects/punim1471/MAGeTbrain/bin/mb_register_labelmask_antsReg_allANTs: line 83: 167329 Killed antsRegistration --dimensionality 3 --verbose --minc --output [ ${tmpdir}/reg ] --use-histogram-matching 0 --initial-moving-transform [ ${target},${atlas},1 ] $(eval echo ${affinesteps})

Many thanks, Terry

gdevenyi commented 1 year ago

> in particular the NOMASK messages

This is expected. The ANTs developers misused their file-reading code to handle mask reading, which produces these warnings: whenever I want to specify no mask for a stage, I pass a non-existent "filename", which throws the warning.

> function was killed.

You're overloading the machine, and the commands are being killed due to an out-of-memory condition (check dmesg). You will need to adjust the parallelism options (see -j N, --processes N: number of processes to parallelize over).
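One way to confirm this is to scan the dmesg output for the kernel's OOM-killer messages and pull out the killed process names; a small sketch (the message format matches recent kernels, and the sample log line is illustrative, not taken from this system):

```python
import re

def find_oom_kills(dmesg_text):
    """Return names of processes the kernel's OOM killer terminated."""
    pattern = re.compile(r"Out of memory: Killed process \d+ \((?P<name>[^)]+)\)")
    return [m.group("name") for m in pattern.finditer(dmesg_text)]

# Illustrative dmesg excerpt, not from this system
sample = (
    "[12345.678] Out of memory: Killed process 167329 (antsRegistration) total-vm:...\n"
    "[12350.001] usb 1-1: new high-speed USB device\n"
)
print(find_oom_kills(sample))  # -> ['antsRegistration']
```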

pohankung commented 1 year ago

Hi Dr. Devenyi,

Thank you for the clarification. I was able to review the job status and confirm that the jobs did exceed the available RAM on my testing machine. I've moved the analysis to our cluster.

I was wondering if you have recommendations on a suitable number of processes to parallelize over. When configuring qbatch with the settings below, the MAGeT commands are submitted as 14 jobs for the mb_templatelib stage and 25 jobs for the voting stage. Each job has access to 16 CPUs and 256GB of RAM, with capacity to increase PPJ/RAM as needed.

# Environment variables to customize defaults for local system
export QBATCH_PPJ=16                   # requested processors per job
export QBATCH_CHUNKSIZE=$QBATCH_PPJ    # commands to run per job
export QBATCH_CORES=$QBATCH_PPJ        # commands to run in parallel per job
export QBATCH_NODES=1                  # number of compute nodes to request for the job, typically for MPI jobs
export QBATCH_MEM="256GB"              # requested memory per job
export QBATCH_MEMVARS="mem"            # memory request variable to set
export QBATCH_SYSTEM="slurm"           # queuing system to use ("pbs", "sge", "slurm", or "local")
export QBATCH_QUEUE="physical"         # Name of submission queue
export QBATCH_OPTIONS=""               # Arbitrary cluster options to embed in all jobs
export QBATCH_SCRIPT_FOLDER=".qbatch/" # Location to generate jobfiles for submission
export QBATCH_SHELL="/bin/sh"          # Shell to use to evaluate jobfile
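For reference, the job counts follow from qbatch's chunking: commands are packed QBATCH_CHUNKSIZE at a time into each submitted job. A sketch of the arithmetic (the command count below is illustrative, not derived from this run):

```python
import math

def qbatch_njobs(num_commands, chunksize):
    """qbatch packs `chunksize` commands into each submitted job."""
    return math.ceil(num_commands / chunksize)

# Illustrative: 224 commands with QBATCH_CHUNKSIZE=16 -> 14 jobs
print(qbatch_njobs(224, 16))  # -> 14
```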

Execute mb run in command line (calling on qbatch to submit commands)

cd /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation mb run --clobber --stage-templatelib-walltime "10:00:00"

Your advice on how to improve the configuration would be greatly appreciated.

Many thanks, Terry

gdevenyi commented 1 year ago

Hi,

Your command was -q parallel, which means no jobs were submitted; everything was being run locally. Removing that should submit jobs, and your nodes are more than powerful enough to handle the defaults.

pohankung commented 1 year ago

Hi Dr. Devenyi,

Thank you for the pointers. When we attempted to run the entire pipeline using mb run, the voting jobs timed out on multiple occasions with walltimes as long as 30 hours under the configuration below. As you can see, we limited the number of threads each job can use, as we observed that MAGeT/qbatch seemed to be using more resources on each node than expected (i.e., occupying the maximum threads available per node despite explicit configuration to only call on 8 CPUs).

    export QBATCH_PPJ=8                     # requested processors per job
    export QBATCH_CHUNKSIZE=$QBATCH_PPJ     # commands to run per job
    export QBATCH_CORES=$QBATCH_PPJ         # commands to run in parallel per job
    export QBATCH_NODES=1                   # number of compute nodes to request per job, typically for MPI jobs
    export QBATCH_MEM="256G"                # requested memory per job
    export QBATCH_MEMVARS="mem"             # memory request variable to set
    export QBATCH_SYSTEM="slurm"            # queuing system to use ("pbs", "sge", "slurm", or "local")
    export QBATCH_QUEUE="physical"          # name of submission queue
    export QBATCH_OPTIONS=""                # arbitrary cluster options to embed in all jobs
    export QBATCH_SCRIPT_FOLDER=".qbatch/"  # location to generate jobfiles for submission
    export QBATCH_SHELL="/bin/sh"           # shell used to evaluate the jobfile

    # Limit the number of threads antsRegistration uses
    export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=2

    # Execute mb run on the command line (calling on qbatch to submit commands)
    cd /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation
    mb run --clobber -j 2 --stage-templatelib-walltime "8:30:00" --stage-voting-walltime "30:00:00"

Following on from your previous point about -q parallel, I've copied an example of the commands qbatch passed to our cluster after executing the above config in a shell script. I am not sure whether this may be causing the jobs to run locally only, and therefore contributing to the time-outs?

    /data/scratch/projects/punim1471/MAGeTbrain/bin/mb run vote -s brain115_t1 --clobber -j 2 --stage-templatelib-walltime 10:00:00 --stage-voting-walltime 30:00:00 -q parallel

For context, we are attempting MAGeT with 5 atlases, 21 templates, 50 subjects. Your feedback on our configuration or possible solutions would be greatly appreciated.

Please also let me know if you would like more information. Thank you for your time and support.

Warm Regards, Terry

gdevenyi commented 1 year ago

Some info that might be helpful here:

From the MAGeT help:

Execution options:
  -q {parallel,qbatch,files}, --queue {parallel,qbatch,files}
                        Queueing method to use
  -n                    Dry run. Show what would happen.
  -j N, --processes N   Number of processes to parallelize over.
  --clobber             Overwrite output if it exists
  --stage-templatelib-walltime <walltime>
                        Walltime for jobs submitted to build the template library.
  --stage-templatelib-procs <procs>
                        Number of processes to run per node when building the template library.
  --stage-voting-walltime <walltime>
                        Walltime for jobs submitted to do label fusion.
  --stage-voting-procs <procs>
                        Number of processes to run per node when doing label fusion.

--stage-templatelib-procs and --stage-voting-procs determine how many commands are packed into a single job, while --processes determines how many run in parallel. Setting these all to 1 will split things up as much as possible. The default configuration was for a SLURM cluster with 1 job = 1 node, which is why jobs are packed up to saturate nodes and be a good citizen.
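The packing arithmetic can be sketched in a few lines of shell (illustrative only, not qbatch's actual implementation; the 105-command count assumes the 5-atlas x 21-template example from this thread):

```shell
# Sketch of qbatch-style packing: how many jobs result from splitting a
# fixed number of commands into chunks of a given size.
total=105          # e.g. 5 atlases x 21 templates registration commands
for chunksize in 16 1; do
  jobs=$(( (total + chunksize - 1) / chunksize ))  # ceiling division
  echo "chunksize=$chunksize -> $jobs jobs"
done
```

With chunksize 16 (the "1 job = 1 node" packing) the 105 commands become 7 large jobs; with chunksize 1 they become 105 small jobs the scheduler can place freely.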

It may be worthwhile explaining your SLURM cluster completely in terms of config and features so I can recommend a default qbatch config that plays nice, as it also interacts with MAGeTbrain.

pohankung commented 1 year ago

Hi Dr. Devenyi,

Thank you for the additional pointers. I am trying to get in touch with the managing team to obtain more information on the SLURM cluster config/features, and will be in touch again shortly.

Apologies for the delayed response and thank you very much for your assistance thus far.

Warm Regards, Terry

pohankung commented 1 year ago

Hi Dr. Devenyi,

Thank you for your patience. The SLURM cluster I am currently accessing has 72 cores per node: 14 nodes with 1519 GB of memory, and 79 nodes with 710 GB of memory.

The most recent shell script I used to submit the MAGeT jobs to our cluster is quoted below, which timed out. Your advice on how to configure the job would be very much appreciated.

    #!/bin/bash

    # Load required modules
    module load foss/2020b gcccore/10.2.0 python/3.8.6 minc-toolkit/1.9.18.2 ants/2.4.3 parallel/20210322

    # Activate MAGeT
    source /data/scratch/projects/punim1471/MAGeTbrain/bin/activate

    # Environment variables to customize defaults for the local system
    export QBATCH_PPJ=8                     # requested processors per job
    export QBATCH_CHUNKSIZE=$QBATCH_PPJ     # commands to run per job
    export QBATCH_CORES=$QBATCH_PPJ         # commands to run in parallel per job
    export QBATCH_NODES=1                   # number of compute nodes to request per job, typically for MPI jobs
    export QBATCH_MEM="256G"                # requested memory per job
    export QBATCH_MEMVARS="mem"             # memory request variable to set
    export QBATCH_SYSTEM="slurm"            # queuing system to use ("pbs", "sge", "slurm", or "local")
    export QBATCH_QUEUE="physical"          # name of submission queue
    export QBATCH_OPTIONS=""                # arbitrary cluster options to embed in all jobs
    export QBATCH_SCRIPT_FOLDER=".qbatch/"  # location to generate jobfiles for submission
    export QBATCH_SHELL="/bin/sh"           # shell used to evaluate the jobfile

    # Limit the number of threads antsRegistration uses so MAGeT does not occupy all 72 cores on a node
    export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=2

    # Execute mb run (calling on qbatch to submit commands)
    cd /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation
    mb run --clobber -j 2 --stage-templatelib-walltime "8:30:00" --stage-voting-walltime "48:00:00"

Please also let me know if you require further information.

Many thanks, Terry

gdevenyi commented 1 year ago

Hi Terry,

Okay, so given the nodes and the configuration, your cluster is more than powerful enough to handle this; however, you're not dedicating enough resources to the individual processes.

I suggest:

    export QBATCH_PPJ=8
    export QBATCH_CHUNKSIZE=1
    export QBATCH_CORES=1
    export QBATCH_MEM=32G

And:

    mb run --stage-templatelib-procs 1 --stage-voting-procs 1 --stage-templatelib-walltime 8:00:00 --stage-voting-walltime 2:00:00

This should break up the submission into individual jobs, allowing for the best scheduling given your system. The reason for all the changes is that the original job-submission design assumed a "1 job = 1 computer" style of SLURM configuration, where we needed to pack commands together. You're not limited to that here, so you can submit many small jobs and let the scheduler handle it.

pohankung commented 1 year ago

Hi Dr. Devenyi,

Just a quick update that I am trialling the pipeline with your suggested configuration. I will keep you posted on our progress. Thank you very much for your assistance thus far.

Warm Regards, Terry

pohankung commented 1 year ago

Hi Dr. Devenyi,

Thank you for your advice on the suitable pipeline. I've attempted the suggestions and the voting jobs unfortunately continued to time out with the above configuration. I've quoted my most recent attempt below for your reference. Please let me know if you need more information for further advice. Thank you very much and I look forward to hearing from you.

    #!/bin/bash

    # Load required modules
    module load foss/2020b gcccore/10.2.0 python/3.8.6 minc-toolkit/1.9.18.2 ants/2.4.3 parallel/20210322

    # Activate MAGeT
    source /data/scratch/projects/punim1471/MAGeTbrain/bin/activate

    # Environment variables to customize defaults for the local system
    export QBATCH_PPJ=8                     # requested processors per job
    export QBATCH_CHUNKSIZE=1               # commands to run per job
    export QBATCH_CORES=1                   # commands to run in parallel per job
    export QBATCH_NODES=1                   # number of compute nodes to request per job, typically for MPI jobs
    export QBATCH_MEM="128G"                # requested memory per job
    export QBATCH_MEMVARS="mem"             # memory request variable to set
    export QBATCH_SYSTEM="slurm"            # queuing system to use ("pbs", "sge", "slurm", or "local")
    export QBATCH_QUEUE="physical"          # name of submission queue
    export QBATCH_OPTIONS=""                # arbitrary cluster options to embed in all jobs
    export QBATCH_SCRIPT_FOLDER=".qbatch/"  # location to generate jobfiles for submission
    export QBATCH_SHELL="/bin/sh"           # shell used to evaluate the jobfile

    # Limit the number of threads antsRegistration uses
    export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=2

    # Execute mb run (calling on qbatch to submit commands)
    cd /data/scratch/projects/punim1471/NeuroWIRED/Hb_segmentation
    mb run --stage-templatelib-procs 1 --stage-voting-procs 1 --stage-templatelib-walltime "3:30:00" --stage-voting-walltime "10:00:00"

Many thanks, Terry

gdevenyi commented 1 year ago

I've attempted the suggestions and the voting jobs unfortunately continued to time out

Can you give more details about the jobs themselves? There should be logfiles for all of them.

Can you also completely describe your dataset? Sequence, resolution, number of subjects, preprocessing steps. Sharing screenshots (or an example file) would also help here.

Finally, did you check if any progress was made? mb status will show which steps have been completed.

pohankung commented 1 year ago

Hi Dr. Devenyi,

Apologies for the delayed response as I gather the requested information.

Please see attached example log files for the template and failed voting jobs using the configuration described in my previous comment. I did not run mb status after my past attempts, and I will be reporting this in my comments from now on. slurm-mb_templatelib_2023-04-27T01-57-05-46833816_1.txt slurm-mb_voting_2023-04-27T01-57-05-46833817_50.txt

For Hb segmentation, we are using 5 atlases obtained from your library, as well as 21 templates and 50 subjects from our dataset. The template images are a subset of the 50 subjects. The subject images are de-noised T1-weighted anatomical images collected on a Siemens 7T research scanner with an MEMP2RAGE sequence (first of four echoes; 240 contiguous sagittal slices; repetition time = 4.5 s; echo time = 2.22 ms; 320 x 320-pixel matrix; slice thickness = 0.75 mm). The images were originally in NIfTI format and pre-processed in SPM12: realigned to the mean image, bias-corrected, and skull-stripped in native space.

The pre-processed Nifti images were then converted to MINC2 format using nii2mnc and mincconvert in minc-toolkit, before input to MAGeT as instructed.

Hope this is helpful and please let me know if you would like more information. Your continued assistance is very much appreciated.

Warm Regards, Terry

pohankung commented 1 year ago

Hi Dr. Devenyi,

Here's a quick update: I re-attempted the pipeline with an increased walltime of --stage-voting-walltime "12:00:00" and encountered the same time-out issue. Please see the quoted output of mb status below; it seems the program did not move past the atlas-to-template stage.

    mb status
    5 atlases, 21 templates, 50 subjects found
    0 atlas-to-template registration commands left
    566 template-to-subject registration commands left
    5250 transform merging commands left
    5250 label propagation commands left
    50 label fusion commands left

Here are the example log files from the template and voting jobs for your reference. Your advice would be greatly appreciated. slurm-mb_templatelib_2023-05-04T14-44-13-47039899_105.txt slurm-mb_voting_2023-05-04T14-44-13-47039900_50.txt

Also, please let me know if you would prefer discussing these issues over a Zoom meeting. I would be happy to arrange the meeting alongside my research supervisor to potentially collaborate on the project.

Warm Regards and speak soon, Terry

gdevenyi commented 1 year ago

Okay, great, this is good info. There are things to learn from the mb status output.

Questions:

- did you skull strip the atlases?
- have you been deleting the output directory? (you shouldn't be)
- can you share the hidden .qbatch directory, after having deleted it and run the pipeline again (to avoid keeping really old files)?

If you haven't skull-stripped the atlases: I added masks to the atlas release at https://github.com/CoBrALab/atlases/releases/tag/2.0. You should be matching the "kind" of images during processing (either all stripped or all unstripped).

Either way, the pipeline is making progress as per mb status and running again will pick up where it left off if it runs out of time.

pohankung commented 1 year ago

Hi Dr. Devenyi.

Apologies for the delayed response. Please see below responses to your questions:

- did you skull strip the atlases

Thank you for the reminder. No, I have input the downloaded atlas brains and the habenula labels directly. Can I confirm whether I would need to pre-process the atlas brains through bpipe as instructed here (https://github.com/CoBrALab/minc-bpipe-library)? If so, can I confirm that the correct input file for MAGeT will be <basename>.convert.n4correct.cutneckapplyautocrop.beastextract.mnc -- the extracted T1 in native space, with bias field corrected? Is any transformation required for the supplied habenula labels?

- have you been deleting the output directory? (you shouldn't be)

Yes, I have been. I will keep the output directory from now on, given that MAGeT will pick up where it left off when it runs out of time. Thank you!

- can you share the hidden .qbatch directory, after having deleted it and run the pipeline again (to avoid keeping really old files)

Yes, I have been removing old command array files whenever a new run is submitted. Please see attached the most recent array files (converted to .txt). mb_templatelib_2023-05-04T14-44-13.txt mb_voting_2023-05-04T14-44-13.txt

Thank you in advance for your continued assistance.

Warm Regards, Terry

gdevenyi commented 1 year ago

Can I confirm if I would need to pre-process the atlas brains through

No, I've added a brainmask download to https://github.com/CoBrALab/atlases/releases/tag/2.0; you can use

    minccalc -expression 'A[0]*A[1]' brain1_t1.mnc brain1_mask.mnc brain1_extracted.mnc

to produce a skull-stripped version.
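To cover all five atlas brains, a loop along these lines would work (the brainN filename pattern is an assumption; echo is left in so you can inspect the generated commands before running them):

```shell
# Hypothetical loop generating the masking command for each of the five
# atlas brains; drop the `echo` to actually execute minccalc.
for i in 1 2 3 4 5; do
  echo minccalc -expression 'A[0]*A[1]' "brain${i}_t1.mnc" "brain${i}_mask.mnc" "brain${i}_extracted.mnc"
done
```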

Yes, I have been. I will keep the output directory from now on, given the information that MAGeT will pick up where it left off when it runs out of time. Thank you!

This should resolve your overall outstanding issue of running out of time. It's making progress, and if you let it pick up where it left off, it should finish.

Yes, I have been removing old command array files whenever a new run is submitted. Please see attached the most recent array files (converted to .txt).

Okay, here I see that -j interacts differently with --stage-voting-procs than with --stage-templatelib-procs. You may want to try --stage-voting-procs 8 instead of 1.

Overall, the core reason for the lack of progress was deleting the partially complete pipeline results every time. Resumption was built in for exactly this purpose, in case of timeouts and the like; repeating the runs until completion should work.

pohankung commented 1 year ago

Hi Dr. Devenyi,

Apologies for the delayed response. I've encountered some issues on our cluster and am still trialling your suggestions, I will provide an update soon.

Thank you for your assistance thus far, and speak soon.

Warm Regards, Terry

pohankung commented 1 year ago

Hi again Dr. Devenyi,

I am pleased to report that MAGeT segmentation appears to have completed successfully for all but three of our 50 subjects. The output of mb status indicates that the 3 subjects failed during the final label fusion stage:

    5 atlases, 21 templates, 50 subjects found
    0 atlas-to-template registration commands left
    0 template-to-subject registration commands left
    3 label fusion commands left

The error seems to pertain to a file reading issue. I've copied an example error message below and attached the full output log for your reference:

    /data/scratch/projects/punim1471/MAGeTbrain/bin/mb run vote -s brain150_t1 --stage-templatelib-procs 1 --stage-voting-procs 2 --stage-templatelib-walltime 8:00:00 --stage-voting-walltime 10:00:00 -q parallel
    /usr/local/easybuild-2019/easybuild/build/MINC/1.9.18.2/foss-2020b/minc-toolkit-v2/libminc/libsrc2/volume.c:1420 (from MINC): Error: Trying to open minc file without image variable

slurm-mb_voting_2023-05-30T22-35-04-47689749_1.txt

With that said, I've confirmed using mincviewer and PrintHeader that the input MINC images for the failed subjects are not corrupted and appear normal when compared to those of the successfully segmented subjects. Please see attached an example PrintHeader output comparing a failed subject (150) to a successful one (103). PrintHeader_Failed150_vs_Success103.docx

I was wondering if you may have additional suggestions on how I could diagnose and resolve this issue.

Many thanks, Terry

gdevenyi commented 1 year ago

Have you tried to rerun if this repeats?

My guess is that you have a candidate segmentation that is broken/truncated due to a job being interrupted, and the pipeline is simplistic in that it considers it "done".

Assuming that's the case, you can find the candidate files under output/intermediate/*<SubjectID>_label.mnc, delete them, and rerun the pipeline to recreate them; then try the vote again.
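A sketch of that cleanup, demonstrated on a throwaway directory so nothing real is touched (output/intermediate/ is the path from the comment above; brain150_t1 is a stand-in for the failed subject's ID):

```shell
# Delete stale candidate label files so a rerun regenerates them.
tmp=$(mktemp -d)                          # throwaway stand-in for the project root
mkdir -p "$tmp/output/intermediate"
touch "$tmp/output/intermediate/template01_brain150_t1_label.mnc"

# Remove every candidate label matching the failed subject, then rerun mb.
find "$tmp/output/intermediate" -name '*brain150_t1*_label.mnc' -delete
find "$tmp/output/intermediate" -name '*_label.mnc' | wc -l   # 0 files remain
rm -rf "$tmp"
```

Running `find` with `-print` first, before switching to `-delete`, is a sensible way to double-check which files will be removed.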

pohankung commented 1 year ago

Hi Dr. Devenyi,

Very pleased to report that the suggested solution was effective! We've successfully created individualised masks for all subjects using MAGeT on our cluster.

Thank you very much for your support and assistance over the past couple of months.

Warm Regards, Terry

gdevenyi commented 1 year ago

Very glad to hear that. Good luck with the analysis!