Closed npgretz closed 4 years ago
Hi, could you please provide the version ('git log' top entry), the run environment (cluster (CPUs and memory), workstation (CPUs and memory), VM, OS), and the complete output from the run? Thanks.
This occurs when specifying the '--rigid-model-target' option to specify a MNI template as the target.
Are you saying that when you run without --rigid-model-target you don't get this error? If so, it sounds like your files are not in proper RAS orientation like the MNI target. Please try visualizing them together to confirm.
Git Log top entry for twolevel_dbm:
commit b118f7a7b109430c7d899c367073db1d154f5ad3 (HEAD -> master, origin/master, origin/HEAD)
Merge: 7d489cd 5ada87c
Author: Gabriel A. Devenyi gdevenyi@gmail.com
Date: Tue Mar 31 14:01:10 2020 -0400
Merge pull request #37 from CoBrALab/fwhm
Switch to default blur of FWHM=2x min res
Run Environment: Linux Server which contains ANTS, twolevel_dbm.py, Python, and MRI data.
$ cat /proc/version
Linux version 3.10.0-1062.12.1.el7.x86_64 (mockbuild@sl7-uefisign.fnal.gov) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) )
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 1
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 23
Model: 1
Model name: AMD EPYC 7401 24-Core Processor
Stepping: 2
CPU MHz: 1200.000
CPU max MHz: 2000.0000
CPU min MHz: 1200.0000
BogoMIPS: 3999.51
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-5,24-29
NUMA node1 CPU(s): 6-11,30-35
NUMA node2 CPU(s): 12-17,36-41
NUMA node3 CPU(s): 18-23,42-47
$ cat /proc/meminfo
MemTotal: 131754640 kB
MemFree: 1440796 kB
MemAvailable: 121307240 kB
Buffers: 5800 kB
Cached: 122950316 kB
SwapCached: 100248 kB
Active: 64616736 kB
Inactive: 61872728 kB
Active(anon): 3438912 kB
Inactive(anon): 4363288 kB
Active(file): 61177824 kB
Inactive(file): 57509440 kB
Unevictable: 64 kB
Mlocked: 64 kB
SwapTotal: 50331644 kB
SwapFree: 49511164 kB
Dirty: 16 kB
Writeback: 0 kB
AnonPages: 3446020 kB
Mapped: 295580 kB
Shmem: 4268848 kB
Slab: 2244540 kB
SReclaimable: 1850272 kB
SUnreclaim: 394268 kB
KernelStack: 33600 kB
PageTables: 114380 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 116208964 kB
Committed_AS: 18915208 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 405228 kB
VmallocChunk: 34258111484 kB
HardwareCorrupted: 0 kB
AnonHugePages: 493568 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 437840 kB
DirectMap2M: 15187968 kB
DirectMap1G: 118489088 kB
Workstation used to SSH into Server:
OS Name Microsoft Windows 10 Education
Version 10.0.18362 Build 18362
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Name DESKTOP-CEPU2Q8
System Manufacturer To Be Filled By O.E.M.
System Model To Be Filled By O.E.M.
System Type x64-based PC
System SKU To Be Filled By O.E.M.
Processor AMD Ryzen 7 3800X 8-Core Processor, 3900 Mhz, 8 Core(s), 16 Logical Processor(s)
BIOS Version/Date American Megatrends Inc. P1.00, 6/18/2019
SMBIOS Version 3.2
Embedded Controller Version 255.255
BIOS Mode Legacy
BaseBoard Manufacturer ASRock
BaseBoard Product X570 Steel Legend WiFi ax
BaseBoard Version
Platform Role Desktop
Secure Boot State Unsupported
PCR7 Configuration Binding Not Possible
Windows Directory C:\WINDOWS
System Directory C:\WINDOWS\system32
Boot Device \Device\HarddiskVolume4
Locale United States
Hardware Abstraction Layer Version = "10.0.18362.628"
Time Zone Central Daylight Time
Installed Physical Memory (RAM) 63.9 GB
Total Physical Memory 63.9 GB
Available Physical Memory 57.1 GB
Total Virtual Memory 73.4 GB
Available Virtual Memory 62.7 GB
Page File Space 9.50 GB
Page File C:\pagefile.sys
Kernel DMA Protection Off
Virtualization-based security Not enabled
Device Encryption Support Reasons for failed automatic device encryption: TPM is not usable, PCR7 binding is not supported, Hardware Security Test Interface failed and device is not Modern Standby, Un-allowed DMA capable bus/device(s) detected, TPM is not usable
Hyper-V - VM Monitor Mode Extensions Yes
Hyper-V - Second Level Address Translation Extensions Yes
Hyper-V - Virtualization Enabled in Firmware Yes
Hyper-V - Data Execution Protection Yes
Here is the entirety of the output:
$ /apps/twolevel_ants_dbm/twolevel_dbm.py --rigid-model-target MNI.nii.gz 1level input.csv
Processing Second-Level DBM outputs
0%| | 0/2 [00:02<?, ?it/s]
Traceback (most recent call last):
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 714, in <module>
main()
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 710, in main
secondlevel(inputs, args, secondlevel=False)
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 281, in secondlevel
args.dry_run,
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 35, in run_command
shell=True,
File "/apps/miniconda3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'ANTSUseDeformationFieldToGetAffineTransform output/secondlevel/secondlevel_01601InverseWarp.nii.gz 0.25 affine output/compositewarps/secondlevel/016_delin.mat output/secondlevel/secondlevel_otsumask.nii.gz' died with <Signals.SIGABRT: 6>.
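For readers decoding the last line of that traceback: "died with <Signals.SIGABRT: 6>" is how Python's subprocess module reports a child process that was killed by a signal rather than exiting with a status code. The sketch below is only an illustration and is not taken from twolevel_dbm.py; `describe_failure` is a hypothetical helper, and it passes the command as a list instead of using shell=True as the pipeline does.

```python
import signal
import subprocess
import sys


def describe_failure(cmd):
    """Run cmd and summarise how it ended (hypothetical helper)."""
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as err:
        # A negative return code means the child was killed by that signal.
        if err.returncode < 0:
            return f"killed by {signal.Signals(-err.returncode).name}"
        return f"exited with status {err.returncode}"
    return "ok"


if __name__ == "__main__":
    # A child that aborts itself, standing in for a crashing ANTs binary.
    print(describe_failure([sys.executable, "-c", "import os; os.abort()"]))
```

A SIGABRT usually means the program itself called abort(), for example after an uncaught C++ exception, so the tool's own log is the place to look next.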
Thanks for the details of the system.
To come back to my earlier question: are you saying that when you run without --rigid-model-target you don't get this error? If so, it sounds like your files are not in proper RAS orientation like the MNI target. Please try visualizing them together to confirm.
Sorry, I forgot to reply to your earlier response. I am able to run it without --rigid-model-target. I did check the RAS orientation and they all are in the same orientation.
This failure hints that something about your model choice is causing an issue. I would suggest first trying to visualize output/secondlevel/secondlevel_template0.nii.gz simultaneously with MNI.nii.gz. In MINC-land I would do this with register, but maybe ITKsnap or similar can achieve this.
A few more questions:
MNI.nii.gz (which MNI model, is it extracted, what resolution is it)?
Thanks for your quick responses @gdevenyi. Nick has been the one figuring out how to run these programs for my project. The initial confusion came with the rigid body transformation, which I had misunderstood as the common space transformation. I thought the rigid body step was creating a single rigid image out of all input images, not rigidly transforming them all, and this resulted in bad registration before the creation of our template. We are trying to bring all individuals (stripped and bias-field-corrected T1s) into MNI space (extracted, the 2009 version, 1x1x1 resolution) and then create our template from there. So the larger problem was me misunderstanding the functions! I am wondering, then: does the resample to common space option create an unbiased image based on all input images and then apply that to the MNI (if we input that as our common space option)? Or does it take all input images into MNI space and then create an averaged template from that? Thanks in advance!
Hi,
The --rigid-model-target option is used to define the orientation and sampling of the space in which the unbiased template is then constructed. This means, if provided, inputs are rigidly registered and resampled to the target, and then averaged. From there, that average is used as the target for a regular modelbuild. If this option is not provided, all images are dumb-averaged without alignment, then all images are rigidly registered to that dumb-average and averaged, and finally the model build begins.
Separate from that, the --resample-to-common-space option has no impact at all on the unbiased template construction. It is a post-processing option, where the final unbiased template is registered to that target, and final jacobians are also generated in that final common space provided in addition to the unbiased space. It can be used with, or without the --rigid-model-target option.
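To illustrate the initialization described above for --rigid-model-target, here is a minimal Python sketch. It is not code from twolevel_dbm.py or ANTs: `average`, `build_initial_target`, and the `rigid_register` callable are hypothetical names, and each image is represented by a flat list of intensities purely for illustration.

```python
from typing import Callable, List, Optional

Image = List[float]  # toy stand-in for an image volume


def average(images: List[Image]) -> Image:
    """Voxel-wise mean of equally sized images."""
    count = len(images)
    return [sum(values) / count for values in zip(*images)]


def build_initial_target(
    inputs: List[Image],
    rigid_register: Callable[[Image, Image], Image],
    rigid_model_target: Optional[Image] = None,
) -> Image:
    """Return the average that seeds the regular modelbuild (sketch only)."""
    if rigid_model_target is None:
        # No target given: start from the unaligned ("dumb") average.
        rigid_model_target = average(inputs)
    # Rigidly register and resample every input to the target, then average.
    aligned = [rigid_register(image, rigid_model_target) for image in inputs]
    return average(aligned)


if __name__ == "__main__":
    # Identity "registration" as a placeholder for a real rigid alignment.
    identity = lambda moving, fixed: moving
    print(build_initial_target([[0.0, 2.0], [2.0, 4.0]], identity))
```

In this sketch the target only decides what the inputs are aligned to before averaging; the target image itself never enters the average.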
To make sure I understand: if we use the rigid body option with our MNI template, all images will be transformed via a rigid transformation (which means preserving shape and relative size, correct? Not an affine transformation) to that MNI space and then averaged. I'm having trouble articulating my next issue, so here is my best attempt: if we have individuals with varying brain sizes, will a rigid transformation be our best option? I feel like some of the issues we have been observing are a result of brain sizes not matching and the rigid body transformation not being able to adequately match them. I have attached an image which seems to highlight my question, but I look forward to your feedback. Thanks again!
To make sure I understand: if we use the rigid body option with our MNI template, all images will be transformed via a rigid transformation (which means preserving shape and relative size, correct? Not an affine transformation)
Correct
I'm having trouble articulating my next issue so here is my best attempt: if we have individuals with varying brain sizes, will a rigid transformation be our best option?
This is only the initialization, so don't be too concerned with differences in size. The full modelbuild is a rigid-affine-SyN and will converge on an unbiased average; if you initialize with something else, you will, by definition, be biasing the model. What you're showing in your screenshot is the common issue that the MNI model is 30% larger than most brains. More important is that your brains did get well aligned in terms of rotation and centering, so that the model axes will be well aligned when it gets built.
Regarding the original bug report, I have just encountered the same error in a test run. I believe we have uncovered an ANTs bug, as when I used an older version of ANTs, the bug went away with no changes to my code.
How you can help me here is to check your ANTs version and report back, and share the output/secondlevel/secondlevel.log file, which I believe will show some symptoms I saw in my logs prior to the error you reported.
Here is the Traceback:
Traceback (most recent call last):
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 714, in <module>
main()
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 710, in main
secondlevel(inputs, args, secondlevel=False)
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 281, in secondlevel
args.dry_run,
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 35, in run_command
shell=True,
File "/study/gvpain/apps/miniconda3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'ANTSUseDeformationFieldToGetAffineTransform output/secondlevel/secondlevel_01601InverseWarp.nii.gz 0.25 affine output/compositewarps/secondlevel/016_delin.mat output/secondlevel/secondlevel_otsumask.nii.gz' died with <Signals.SIGABRT: 6>.
We are on version 2.3.1 of ANTs.
That output has nothing to do with the modelbuild, nor does it make sense in light of the other error you shared before, where the modelbuild had been completed and postprocessing was proceeding. Can you please clean up everything and run the pipeline from scratch, saving all output and the complete commands run?
We are on version 2.3.1 of ANTs.
The error in comment 1 also doesn't make sense in light of that. Can you please run antsRegistration --version? I have now confirmed that the latest ANTs git master fixes the error you reported in comment 1 for me.
Here is the output from antsRegistration --version:
ANTs Version: 2.3.1.dev159-gea5a7
Compiled: Apr 8 2020 11:22:49
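Since "2.3.1" and "2.3.1.dev159-gea5a7" turned out not to be the same thing here, it can help to capture the exact version string programmatically. The following is a rough sketch, assuming the output keeps the "ANTs Version:" prefix shown above; `parse_ants_version` and `installed_ants_version` are hypothetical helper names, not part of ANTs or twolevel_dbm.py.

```python
import re
import shutil
import subprocess


def parse_ants_version(text):
    """Extract the version token from `antsRegistration --version` output."""
    match = re.search(r"ANTs Version:\s*(\S+)", text)
    return match.group(1) if match else None


def installed_ants_version():
    """Return the version of the antsRegistration found on PATH, if any."""
    if shutil.which("antsRegistration") is None:
        return None
    result = subprocess.run(
        ["antsRegistration", "--version"], capture_output=True, text=True
    )
    return parse_ants_version(result.stdout)


if __name__ == "__main__":
    sample = "ANTs Version: 2.3.1.dev159-gea5a7 Compiled: Apr 8 2020 11:22:49"
    print(parse_ants_version(sample))
    print(installed_ants_version())
```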
I have been attempting to recreate the original error but now the script is getting hung up when calling antsMultivariateTemplateConstruction2.sh:
(base) [gretzon@baldi ants_rigid]$ python /study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py --rigid-model-target 0000MNI.nii.gz 1level input.csv
Running Second-Level Modelbuild
Traceback (most recent call last):
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 714, in <module>
main()
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 710, in main
secondlevel(inputs, args, secondlevel=False)
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 242, in secondlevel
results = run_command(command, args.dry_run)
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 35, in run_command
shell=True,
File "/study/gvpain/apps/miniconda3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'antsMultivariateTemplateConstruction2.sh -d 3 -o output/secondlevel/secondlevel_ -r 1 -l 1 -y 0 -c 0 -a 1 -e 1 -g 0.25 -i 3 -n 0 -m CC[4] -t SyN -u 20:00:00 -v 8gb -q 100x100x70x20 -f 6x4x2x1 -s 3x2x1x0 -z 0000MNI.nii.gz 016.nii.gz 04.nii.gz && echo DONE > output/secondlevel/COMPLETE' returned non-zero exit status 1.
I have edited my comments above with the full original traceback that I had saved from earlier tries. I am unable to find any other output in /output/secondlevel/secondlevel.log other than what I had posted pertaining to qsub. With the script now hanging up on antsMultivariateTemplateConstruction2.sh, the secondlevel.log is not being produced.
Something doesn't add up here,
The qsub error you have now deleted is triggered here: https://github.com/ANTsX/ANTs/blob/master/Scripts/antsMultivariateTemplateConstruction2.sh#L644-L649
That check only runs if you specify SGE or PBS via the -c option ("-c 1" or "-c 4"), but the command above clearly shows "-c 0".
Have you modified antsMultivariateTemplateConstruction2.sh to override the cluster mode?
Anyways, I can confirm building ANTs HEAD of today fixes the original issue reported with my testing. Please build/install a newer version.
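The -c check discussed in this comment can also be done mechanically on a logged command line. Below is a small sketch under the assumption stated in the thread that "-c 1" and "-c 4" are the modes that go through qsub; `cluster_flag` and `uses_qsub` are hypothetical helpers, not part of ANTs.

```python
import shlex

# Per the discussion above: these -c values select SGE or PBS, which use qsub.
QSUB_MODES = {"1", "4"}


def cluster_flag(command):
    """Return the value given to -c in a logged command line, or None."""
    tokens = shlex.split(command)
    for index, token in enumerate(tokens):
        if token == "-c" and index + 1 < len(tokens):
            return tokens[index + 1]
        # Also accept the joined spelling, e.g. "-c1".
        if token.startswith("-c") and token[2:].isdigit():
            return token[2:]
    return None


def uses_qsub(command):
    """True if the command selects a cluster mode that submits via qsub."""
    return cluster_flag(command) in QSUB_MODES


if __name__ == "__main__":
    logged = (
        "antsMultivariateTemplateConstruction2.sh -d 3 "
        "-o output/secondlevel/secondlevel_ -r 1 -l 1 -y 0 -c 0 -a 1 -e 1 "
        "-g 0.25 -i 3 -n 0 -m CC[4] -t SyN -z 0000MNI.nii.gz 016.nii.gz 04.nii.gz"
    )
    print(cluster_flag(logged), uses_qsub(logged))
```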
A fresh install of ANTs HEAD did resolve the issue. Thank you very much for your guidance in using twolevel_dbm and for resolving this issue! I apologize for my confusion throughout this process.
I had forgotten to direct the output from running antsMultivariateTemplateConstruction2.sh directly to a different file than my twolevel_dbm.py run, so I believe it was overwritten, causing my confusion. I was trying to use qsub from our server with "-c 1".
I am now successfully using twolevel_dbm.py --rigid-model-target.
Thank you again for your valuable time and assistance!
Wonderful. Glad to hear everything has been sorted out.
Please note I did a bunch of work to the code since you surfaced and asked me a bunch of questions. :)
You can see the changes here: https://github.com/CoBrALab/twolevel_ants_dbm/compare/b118f7a...master
All the changes are "non-functional" but improve logging which will enable me to help out quicker in the future.
Please don't hesitate to open new issues if you run into anything.
Hello again! Hopefully this will just show the rewards of all those logging changes you worked hard to make when you first assisted me.
We also tried constructing a template using --resample-to-common-space and we have run into the same error with ANTSUseDeformationFieldToGetAffineTransform.
Call:
twolevel_dbm.py --resample-to-common-space MNI.nii.gz 1level input.csv
I believe the template was still completed but I am unsure if the resampling was successful. COMPLETE was created and secondlevel.log states "Done creating ... secondlevel_template0.nii.gz". secondlevel_template0.nii.gz looks good although the compositewarps and jacobians directories are empty. My job exited normally but the error log shows:
0%| | 0/60 [00:00<?, ?it/s]
0%| | 0/60 [00:01<?, ?it/s]
Traceback (most recent call last):
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 714, in <module>
main()
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 710, in main
secondlevel(inputs, args, secondlevel=False)
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 281, in secondlevel
args.dry_run,
File "/study/gvpain/apps/twolevel_ants_dbm/twolevel_dbm.py", line 35, in run_command
shell=True,
File "/study/gvpain/apps/miniconda3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'ANTSUseDeformationFieldToGetAffineTransform output/secondlevel/secondlevel_00601InverseWarp.nii.gz 0.25 affine output/compositewarps/secondlevel/006_delin.mat output/secondlevel/secondlevel_otsumask.nii.gz' died with <Signals.SIGABRT: 6>.
Here is the secondlevel.log.
Was the resample successful if ANTSUseDeformationFieldToGetAffineTransform was not?
At least your second level is failing with errors. Your first level may also have this problem, so check the subject-level log files:
Exception caught:
itk::MemoryAllocationError (0x467d9b0)
Location: "unknown"
File: /study/gvpain/apps/ANTs/build/staging/include/ITK-5.1/itkImportImageContainer.hxx
Line: 192
Description: Failed to allocate memory for image.
Of course, antsMultivariateTemplateConstruction2.sh doesn't catch errors properly, so the pipeline keeps running off the rails until it exits normally, and my pipeline can't detect such failures. For reference, see my 3-year-old bug: https://github.com/ANTsX/ANTs/issues/397. I guess to improve my pipeline, I'll need to improve their scripts.
If you're willing to be a guinea pig for this, add set -euo pipefail right at the top of antsMultivariateTemplateConstruction2.sh and we'll see if this catches that error and bails out properly.
To fix this properly, you'll have to adjust how the jobs are being submitted to the cluster, or run on a machine with more RAM available. I expose the options the ANTs pipeline has; beyond those, you'll need to edit their script to adjust job submission parameters.
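Until the scripts themselves fail loudly, one stopgap is to scan the logs for the allocation failure before trusting a run that "exited normally". The sketch below is an assumption-laden illustration: the marker strings are copied from the log excerpt above, other failures may look different, and `find_memory_errors` is a hypothetical helper rather than anything in twolevel_dbm.py.

```python
# Marker strings taken from the log excerpt above; other errors may differ.
MARKERS = ("itk::MemoryAllocationError", "Failed to allocate memory")


def find_memory_errors(lines):
    """Return (line_number, text) pairs for lines that look like ITK
    memory allocation failures."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "Exception caught:",
        "itk::MemoryAllocationError (0x467d9b0)",
        'Location: "unknown"',
        "Description: Failed to allocate memory for image.",
    ]
    for number, text in find_memory_errors(sample):
        print(number, text)
```

To check a real run, pass an open file instead of the sample list, for example `find_memory_errors(open("output/secondlevel/secondlevel.log"))`.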
In the future, please open new bugs for new issues rather than reopening old ones. Thanks.
I am receiving an error with ANTSUseDeformationFieldToGetAffineTransform when twolevel_dbm.py calls it through subprocess.py. This occurs when specifying the '--rigid-model-target' option to specify an MNI template as the target. The MNI template is not listed in the input .csv.
Here is the call:
python twolevel_dbm.py --rigid-model-target MNI.nii.gz 1level input.csv
Here is the relevant traceback:
Traceback (most recent call last):
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 714, in <module>
main()
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 710, in main
secondlevel(inputs, args, secondlevel=False)
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 281, in secondlevel
args.dry_run,
File "/apps/twolevel_ants_dbm/twolevel_dbm.py", line 35, in run_command
shell=True,
File "/apps/miniconda3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'ANTSUseDeformationFieldToGetAffineTransform output/secondlevel/secondlevel_01601InverseWarp.nii.gz 0.25 affine output/compositewarps/secondlevel/016_delin.mat output/secondlevel/secondlevel_otsumask.nii.gz' died with <Signals.SIGABRT: 6>.
Thank you for your insight and time! Nick