Open JescoS opened 1 week ago
Out of memory? Try increasing RAM available to docker?
Are the other images that work successfully as large as these? 1024x1024x175 is quite large.
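For a rough sense of scale, the arithmetic below estimates the footprint of a single dense 1024x1024x175 volume. It assumes 4 bytes per voxel, since the log in this issue shows `--float 1` (single precision). The helper name `image_size_mib` is made up for illustration. Peak usage during registration is likely higher than this, because at least the fixed and moving images are held in memory at the same time.

```python
import math

def image_size_mib(shape, bytes_per_voxel=4):
    """Return the in-memory size of a dense image in MiB."""
    voxels = math.prod(shape)
    return voxels * bytes_per_voxel / 1024**2

shape = (1024, 1024, 175)
print(image_size_mib(shape))     # 700.0 (single precision)
print(image_size_mib(shape, 8))  # 1400.0 (double precision)
```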
It also seems to fail after a different number of iterations every time I run the provided code.
I suspect it is converging after a different number of iterations each time, then running out of memory. Before each stage, it will print:
DIAGNOSTIC,Iteration,metricValue,convergenceValue,ITERATION_TIME_INDEX,SINCE_LAST
If this is consistently printed twice, it is probably converging after a different number of iterations on each run because of random sampling, then trying and failing to start the next level.
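One way to check this, sketched here under the assumption that the verbose output has been captured as text, is to count how often the header line appears. The function name `count_header_lines` and the sample string are made up for illustration; the sample is a placeholder, not real antsRegistration output.

```python
HEADER = "DIAGNOSTIC,Iteration,metricValue,convergenceValue,ITERATION_TIME_INDEX,SINCE_LAST"

def count_header_lines(log_text):
    """Count the lines of captured output that contain the diagnostic header."""
    return sum(1 for line in log_text.splitlines() if HEADER in line)

# Placeholder text standing in for captured output; not a real log.
sample = "\n".join([
    HEADER,
    "(iteration lines omitted)",
    HEADER,
    "(output stops here)",
])
print(count_header_lines(sample))  # 2
```

If the count is the same across several crashing runs while the number of iteration lines differs, that would be consistent with the failure happening when the next level starts.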
I'm having a similar memory issue when registering multiple batches of images. Memory usage seems to increase over time, suggesting that memory is not cleared properly.
I can see a small increase in the number of objects and in RSS memory over time when running registration in a loop. This just registers a normal T1w image to MNI repeatedly:
import ants
import gc
import psutil
import sys
import time

t1w = ants.image_read('t1w.nii.gz')
template = ants.image_read('tpl-MNI152NLin2009cAsym_res-01_T1w.nii.gz')

process = psutil.Process()

# Function to convert bytes to MB
def bytes_to_mb(bytes):
    return round(bytes / 1024 / 1024, 3)

def get_object_description(obj):
    """Generate a description of the object."""
    if isinstance(obj, (list, set, tuple)):
        return f"{type(obj).__name__} of size {len(obj)}: {repr(obj)[:200]}"
    elif isinstance(obj, dict):
        return f"{type(obj).__name__} with {len(obj)} keys: {repr(obj)[:200]}"
    elif isinstance(obj, str):
        return f"String of length {len(obj)}: {repr(obj)[:200]}"
    else:
        return repr(obj)[:200]

def show_largest_objects(limit=10):
    """Print the largest objects in memory, in MB."""
    gc.collect()
    objects = gc.get_objects()
    print(f"Total objects in memory: {len(objects)}")
    sorted_objects = sorted(objects, key=lambda x: sys.getsizeof(x), reverse=True)[:limit]
    for obj in sorted_objects:
        size_in_mb = bytes_to_mb(sys.getsizeof(obj))
        description = get_object_description(obj)
        print(f"{description} - {size_in_mb} MB")

total_memory = list()
total_objects = list()
reg = None

try:
    for i in range(100):
        gc.collect()
        memory_info = process.memory_info()
        total_memory.append(bytes_to_mb(memory_info.rss))
        total_objects.append(len(gc.get_objects()))
        print(f"RSS Memory: {bytes_to_mb(memory_info.rss):.2f} MB")
        print(f"VMS Memory: {bytes_to_mb(memory_info.vms):.2f} MB")
        # show_largest_objects(5)
        reg = ants.registration(template, t1w, aff_iterations=(25, 0, 0, 0), reg_iterations=(1, 0, 0, 0))
        print('Done iteration ' + str(i))
except KeyboardInterrupt:
    print("Monitoring stopped.")

print("Final stats")
gc.collect()
memory_info = process.memory_info()
print(f"RSS Memory: {bytes_to_mb(memory_info.rss):.2f} MB")
show_largest_objects(10)
I ran this for 56 iterations before interrupting:
> memory_values
[1] 309.633 1901.070 1990.355 1991.445 1993.828 2002.277 2009.391 2009.805 2010.035 2010.977 2011.438 2017.438 2021.508 2022.109 2022.117 2022.137 2022.352 2022.359 2024.039 2024.055 2024.062
[22] 2028.086 2028.105 2028.117 2028.297 2028.297 2030.715 2030.734 2030.738 2030.766 2030.766 2030.773 2030.777 2030.777 2030.777 2030.785 2030.785 2030.797 2030.801 2030.820 2030.836 2030.871
[43] 2030.871 2030.871 2030.875 2030.875 2030.992 2037.000 2037.023 2037.023 2043.039 2043.039 2043.082 2043.082 2043.082 2043.113 2043.121 2043.121 2043.121 2043.152 2043.328
> diff(memory_values)
[1] 1591.437 89.285 1.090 2.383 8.449 7.114 0.414 0.230 0.942 0.461 6.000 4.070 0.601 0.008 0.020 0.215 0.007 1.680 0.016 0.007 4.024
[22] 0.019 0.012 0.180 0.000 2.418 0.019 0.004 0.028 0.000 0.007 0.004 0.000 0.000 0.008 0.000 0.012 0.004 0.019 0.016 0.035 0.000
[43] 0.000 0.004 0.000 0.117 6.008 0.023 0.000 6.016 0.000 0.043 0.000 0.000 0.031 0.008 0.000 0.000 0.031 0.176
>
Because the difference is small (after the first iteration) and variable, I'm not sure this is actually a leak on the ANTsPy / ITK side.
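To put a number on "small", the sketch below summarises growth in an RSS series after skipping the initial samples. Treating the first two samples as warm-up is an assumption, based on the two large initial jumps in the series above, and the helper name `growth_after_warmup` is made up for illustration. Only the first eight values from the run are used here to keep the example short.

```python
def growth_after_warmup(rss_mb, warmup=2):
    """Return (total growth, mean growth per step) in MB, ignoring warm-up samples."""
    steady = rss_mb[warmup:]
    if len(steady) < 2:
        raise ValueError("need at least two samples after warm-up")
    total = steady[-1] - steady[0]
    return total, total / (len(steady) - 1)

# First eight RSS samples (MB) from the run above.
rss_mb = [309.633, 1901.070, 1990.355, 1991.445, 1993.828, 2002.277, 2009.391, 2009.805]
total, per_step = growth_after_warmup(rss_mb)
print(f"growth after warm-up: {total:.3f} MB ({per_step:.3f} MB per step)")
```

A steady, roughly constant increase per step over a long run would point more strongly at a leak than the irregular jumps seen here, which could also come from allocator behaviour.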
Describe the bug
Python crashes without any error when calling `ants.registration` to rigidly register one MRI image to another. This only occurs with two specific images in my dataset.
To reproduce
Expected behavior
The registration to happen successfully, or at least that a helpful error message is provided.
Screenshots
Output from the code provided under To reproduce
```
antsRegistration -d 3 -r [0x55be9ed2c2d0,0x55be9ed36050,1] -m mattes[0x55be9ed2c2d0,0x55be9ed36050,1,32,regular,0.2] -t Rigid[0.25] -c 2100x1200x1200x10 -s 3x2x1x0 -f 6x4x2x1 -u 1 -z 1 -o [/tmp/tmpevgyaz8y,0x55be9ecbb2c0,0x55be9f006d20] -x [NA,NA] --float 1 --write-composite-transform 0 -v 1
All_Command_lines_OK
Using single precision for computations.
=============================================================================
The composite transform comprises the following transforms (in order):
1. Center of mass alignment using fixed image: 0x55be9ed2c2d0 and moving image: 0x55be9ed36050 (type = Euler3DTransform)
=============================================================================
Reading mask(s).
Registration stage 0
No fixed mask
No moving mask
number of levels = 4
fixed image: 0x55be9ed2c2d0
moving image: 0x55be9ed36050
Dimension = 3
Number of stages = 1
Use histogram matching = true
Winsorize image intensities = false
Lower quantile = 0
Upper quantile = 1
Stage 1 State
Image metric = MattesMI
Fixed image = Image (0x55be9f0b6ff0)
RTTI typeinfo: itk::Image
```
ANTsPy installation (please complete the following information):
OS: [ Windows, Ubuntu ]
Additional context
The registration only seems to fail for a particular combination of images in my dataset, so I cannot reproduce the error with any publicly available data, unfortunately. It also seems to fail after a different number of iterations every time I run the provided code.
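Since the crash produces no Python error, one way to get at least some information is to run the failing script in a child interpreter and inspect how it ended. This is a minimal sketch using only the standard library; the helper name `run_isolated` is made up for illustration, and the script passed to it is assumed to contain the failing `ants.registration` call.

```python
import subprocess
import sys

def run_isolated(script_path, *args):
    """Run a Python script in a child interpreter and describe how it ended."""
    # -X faulthandler makes CPython dump a traceback on fatal signals such as SIGSEGV.
    proc = subprocess.run([sys.executable, "-X", "faulthandler", script_path, *args])
    code = proc.returncode
    if code == 0:
        return "exited normally"
    if code < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return f"killed by signal {-code}"
    return f"exited with status {code}"
```

On Linux, a result of "killed by signal 9" (SIGKILL) would be consistent with the process being killed for using too much memory, while signal 11 (SIGSEGV) would suggest a crash in native code. faulthandler cannot print anything for SIGKILL, so in that case the return code is the only clue.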