ANTsX / ANTs

Advanced Normalization Tools (ANTs)
Apache License 2.0

Unexpected behavior of antsMultivariateTemplateConstruction2.sh #1339

Closed valeryozenne closed 9 months ago

valeryozenne commented 2 years ago

Hi,

I can't find a solution to the following issue. If you have any clues, thanks in advance.

I'm building a template with 'antsMultivariateTemplateConstruction2.sh' from 12 sets of images. For some reason, one job (always the same one) is not put in the queue, but this is not fully reproducible across repeated runs.

Screenshot from 2022-04-21 10-19-42

A closer look:

Screenshot from 2022-04-21 10-20-18

I'm running my computation on a local computer, the command line is the following:

logCmd antsMultivariateTemplateConstruction2.sh -d 3 -i 4 -k 2 -w 0.5x1 -c 2 -j 6 ${MINC_WINTOUT_SLASH} -t SyN  -n 0 -m CC -r 1 -o ${FICHIER_TEMPLATE}  liste_de_fichier_copiee_ici_${NOW}.csv 

I didn't find any problem before the jobs are launched, but it is possible that something going wrong earlier triggers this.

I can share the data if necessary. Thanks in advance, Best regards, Valéry

ntustison commented 2 years ago

What happens when you try with antsMultivariateTemplateConstruction.sh?

cookpa commented 2 years ago

Just a hunch, but I somewhat suspect a problem with the periods in your output directory path.

Can you reproduce the problem if there are no periods in the directory or file names before the file extension?
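As a sketch of why periods can matter: if a script derives a base name by cutting at the first period, a file name containing a decimal such as 0.5 is truncated early. The function names and the file name below are hypothetical, and whether the ANTs scripts do exactly this is an assumption, not something confirmed here.

```shell
# Hypothetical helper: keep everything before the FIRST period.
strip_at_first_period() {
    printf '%s\n' "$1" | cut -d '.' -f 1
}

# Hypothetical helper: remove only a known image extension.
strip_known_extension() {
    case "$1" in
        *.nii.gz) printf '%s\n' "${1%.nii.gz}" ;;
        *.nii)    printf '%s\n' "${1%.nii}" ;;
        *)        printf '%s\n' "$1" ;;
    esac
}

name="S12_TI2_reoriented_N4_resampled_to_0.5.nii.gz"
strip_at_first_period "$name"   # prints S12_TI2_reoriented_N4_resampled_to_0
strip_known_extension "$name"   # prints S12_TI2_reoriented_N4_resampled_to_0.5
```

If two tools in a pipeline disagree on which of these rules to use, one may write a file that the other cannot find.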

valeryozenne commented 2 years ago

Thanks for your advice. I did additional tests but I cannot solve or even clearly isolate the problem.

Using N=24:

Screenshot from 2022-04-26 10-19-21

Using antsMultivariateTemplateConstruction.sh:

--------------------------------------------------------------------------------------
 Starting ANTS rigid registration on max 6 cpucores. 
 Progress can be viewed in /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job*_metriclog.txt
--------------------------------------------------------------------------------------
Using max 6 parallel threads
Running sh /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job0_r.sh
Running sh /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job10_r.sh
Running sh /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job11_r.sh
Running sh /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job1_r.sh
Running sh /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job2_r.sh
Running sh /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/job3_r.sh
AFFINE: /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/rigid2_0_S12_TI2_reoriented_N4_resampled_to_0Affine.txt
moving_image_filename: /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/S12_TI2_reoriented_N4_resampled_to_0.5.nii.gz components 1
output_image_filename: /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/rigid2_0_S12_TI2_reoriented_N4_resampled_to_0.5.nii.gz
reference_image_filename: /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/MYtemplate0.nii.gz
[0/1]: AFFINE: /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/rigid2_0_S12_TI2_reoriented_N4_resampled_to_0Affine.txt
User Linear interpolation 
HDF5-DIAG: Error detected in HDF5 (1.12.1) thread 0:
  #000: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5Fdeprec.c line 156 in itk_H5Fis_hdf5(): unable to determine if file is accessible as HDF5
    major: File accessibility
    minor: Not an HDF5 file
  #001: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5VLcallback.c line 3769 in itk_H5VL_file_specific(): file specific failed
    major: Virtual Object Layer
    minor: Can't operate on object
  #002: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5VLcallback.c line 3699 in H5VL__file_specific(): file specific failed
    major: Virtual Object Layer
    minor: Can't operate on object
  #003: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5VLnative_file.c line 384 in itk_H5VL__native_file_specific(): error in HDF5 file check
    major: File accessibility
    minor: Unable to initialize object
  #004: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5Fint.c line 1073 in itk_H5F__is_hdf5(): unable to open file
    major: File accessibility
    minor: Unable to initialize object
  #005: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5FD.c line 723 in itk_H5FD_open(): open failed
    major: Virtual File Layer
    minor: Unable to initialize object
  #006: /home/vozenne/Dev/antsInstallExample/build/ITKv5/Modules/ThirdParty/HDF5/src/itkhdf5/src/H5FDsec2.c line 352 in H5FD__sec2_open(): unable to open file: name = '/workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/rigid2_0_S12_TI2_reoriented_N4_resampled_to_0Affine.txt', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0
    major: File accessibility
    minor: Unable to open file
Exception caught during WarpImageMultiTransform.

[...]

/home/vozenne/Dev/antsInstallExample/install/bin/antsMultivariateTemplateConstruction.sh: line 278: 2701277 Segmentation fault      (core dumped) ${ANTSPATH}/AverageImages $dim $output 2 ${images[@]}
summarizeimageset: ERROR - output file /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/MYtemplate0.nii.gz could not be created
ERROR: command exited with nonzero status 1
Command: antsMultivariateTemplateConstruction.sh -d 3 -i 4 -k 1 -c 2 -j 6 -m 225x75x25 -t GR -n 0 -s CC -r 1 -o /workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/Processing_ANTs//Results_Template/Template_Debug_12_Ti2//Resolution_05/Syn_Template_05/MY liste_de_fichier_copiee_ici_04_26_2022_43_07.csv

cookpa commented 2 years ago

Can you try this example? templateCommandMultivariateBSplineSyN.sh from here

https://github.com/ntustison/TemplateBuildingExample/blob/master/BrainSlices/templateCommandMultivariateBSplineSyN.sh

valeryozenne commented 2 years ago

It works well. I also tested it with parallel execution, using -c 2 -j 6. So something must be wrong with either my data or my script. I'll keep looking.

gdevenyi commented 2 years ago

Looks like this system might be configured in French. Can you export LC_ALL=C to override the localization settings and see if that fixes things?
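A minimal sketch of that override, assuming a POSIX shell with `env` available. The wrapper name run_with_c_locale is hypothetical; it forces the C locale for one command only, leaving the calling shell's settings untouched.

```shell
# Hypothetical wrapper: run any command with LC_ALL=C in its environment.
run_with_c_locale() {
    env LC_ALL=C "$@"
}

# Confirm what the child process sees.
run_with_c_locale sh -c 'echo "LC_ALL=$LC_ALL"'
```

The template script could be launched the same way by passing it and its usual arguments to the wrapper. Running `export LC_ALL=C` instead applies the override to everything started from the current shell.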

cookpa commented 2 years ago

Good suggestion @gdevenyi

I am really puzzled by the intermittent nature of the problem. Is it possible that a disk is filling up or a quota is being enforced?
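One quick way to rule out a full disk, sketched with standard `df` and `awk`. The function name free_kb is hypothetical, and the directory argument would be the template output directory. User quotas are not covered by this check.

```shell
# Hypothetical helper: print the available space, in 1024-byte blocks,
# on the filesystem holding the given directory.
free_kb() {
    # -P forces the portable one-line-per-filesystem format; column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

free_kb /tmp
```

Checking this before and during a run would show whether free space drops toward zero while the jobs are writing intermediate files.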

gdevenyi commented 2 years ago

My other thought was that something might be malformed in the CSV (we haven't seen it). Some of my users helpfully make their files on OSX/Windows and end up with broken line endings.
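A sketch for checking an image list for Windows line endings and writing a cleaned copy, using only `grep` and `tr`. The function names and the demo file contents are hypothetical.

```shell
# Hypothetical helper: count lines that end in a carriage return.
count_crlf_lines() {
    cr=$(printf '\r')
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c "${cr}\$" "$1" || true
}

# Hypothetical helper: write the file to stdout with carriage returns removed.
to_unix_newlines() {
    tr -d '\r' < "$1"
}

# Demo on a throwaway file written with Windows line endings.
demo=$(mktemp)
printf 'subject01.nii.gz\r\nsubject02.nii.gz\r\n' > "$demo"
count_crlf_lines "$demo"
to_unix_newlines "$demo" > "$demo.unix"
count_crlf_lines "$demo.unix"
rm -f "$demo" "$demo.unix"
```

A non-zero count on the real CSV would mean each file name carries a trailing carriage return, which can make otherwise valid paths fail to open.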

cookpa commented 2 years ago

Yes, Python defaults to Excel style for CSV, which uses Windows newlines regardless of the system.

Some other suggestions:

  1. Ensure each run starts fresh; don't re-run over existing output. Let us know if the problem is reproducible that way.
  2. You can try bash -x antsMultivariateTemplateConstruction2.sh ... 2>&1 | tee debugLog.txt to enable debug mode (the trace is written to stderr, hence the 2>&1). This will print a lot of information to the terminal, but might yield some clues.
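A small sketch of how the debug trace behaves, using a one-line stand-in instead of the template script. By default bash -x writes its trace lines (prefixed with "+ ") to stderr, so stderr has to be merged into stdout for a pipe to `tee` to capture them. The function name and temporary log file are hypothetical.

```shell
# Hypothetical stand-in: trace a trivial command and merge stderr into stdout.
trace_to_stdout() {
    bash -x -c 'echo hello' 2>&1
}

log=$(mktemp)
trace_to_stdout | tee "$log"
# Count the trace lines that reached the log file.
grep -c '^+ ' "$log"
rm -f "$log"
```

Without the 2>&1, the trace still appears on the terminal but the file written by `tee` contains only the regular output.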

valeryozenne commented 2 years ago

Thanks a lot for all the suggestions. Here is my status:

I'm currently lost! Before trying @gdevenyi's suggestion, I re-ran the script, and I can no longer reproduce the problem. I have no idea why.

The .csv files were fine. Here is the output of the "locale" command:

vozenne@bigcalculo:/workspace_QMRI/PROJECTS_DATA/2022_RECH_Template_Bruker/CODE_ANTs$ locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=

So I have some templates to build in the next few days. I'll let you know if the problem comes back. I might have some future questions about parcellation.

cookpa commented 2 years ago

Related to the CSV file, I can confirm that if I make an imagelist.csv the old-fashioned way on Mac OS (definitely no Windows newlines), I can run

${ANTSPATH}/antsMultivariateTemplateConstruction2.sh \
  -d 2 \
  -o ${outputPath}T_ \
  -i 4 \
  -g 0.2 \
  -j 4 \
  -c 2 \
  -k 2 \
  -w 1x1 \
  -f 8x4x2x1 \
  -s 3x2x1x0 \
  -q 100x70x50x10 \
  -n 1 \
  -r 1 \
  -l 1 \
  -m CC[2] \
  -t BSplineSyN[0.1,26,0] \
  imagelist.csv

correctly on the brain slices.