Open nicoloesch opened 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 84.75%. Comparing base (cfc8aaf) to head (f8335ef).
:umbrella: View full report in Codecov by Sentry.
Update: I am able to run my models with roughly 6GB of memory usage instead of 70-100GB. It appears the change fixed the memory leak.
Hi, I confirm that this PR does indeed remove the memory increase observed in this example.
Not sure this is related to this issue, but I now wonder what the copy argument of a transform is intended for? I thought that copy=False would apply the transform directly to the input subject, but:
import torchio as tio
suj = tio.datasets.Colin27()
tr = tio.RandomAffine(copy=False)
sujt = tr(suj)
# Even though copy=False, sujt is a new instance; suj and sujt differ (and have different data)
suj.history
[]
sujt.history
[Affine(scales=(tensor(0.9252, ...
@romainVala I assume the copy argument is there to make a deep copy of the subject and to operate only on that copy, whereas the initially loaded subject stays unchanged. I think one use case of copy=True could be if the Subject contains Image elements that are initialised with a tensor as opposed to an image path. However, I was also wondering whether we should not default this argument to False instead, as we would re-load the individual every time we iterate over the DataLoader if we provide a path argument. At least that is my understanding. What are your thoughts @fepegar?
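To make the reloading point concrete, here is a minimal sketch of what I mean (the file path is hypothetical; a tio.SubjectsDataset served through a plain DataLoader is the usual setup):

import torch
import torchio as tio

# Path-based image: the voxel data is only read from disk when the dataset
# serves the subject, so every iteration starts again from the untouched file.
subject = tio.Subject(t1=tio.ScalarImage('/data/sub-01_T1w.nii.gz'))  # hypothetical path
dataset = tio.SubjectsDataset([subject], transform=tio.RandomAffine())
loader = torch.utils.data.DataLoader(dataset, batch_size=1)

for batch in loader:
    ...  # each batch is built from a freshly loaded, untransformed image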
we would re-load the individual every time we iterate over the DataLoader if we provide a path argument
Is this not what you would expect? Otherwise, you might run out of memory if your dataset is not small.
Is this not what you would expect? Otherwise, you might run out of memory if your dataset is not small.
This is exactly what I expect, but wouldn't the copy then be entirely obsolete? We are re-loading it from disk every time, so we always have the initial state available. What is the point of protecting (through copying) the initial subject if we never store it back, i.e. never change the image on disk?
This was exactly my point; I do not see how the copy argument changes the behaviour.
If I test with a tensor, it is the same:
import torchio as tio, numpy as np, torch
img = tio.ScalarImage(tensor=torch.rand((1,10,10,10)), affine=np.eye(4))
suj = tio.Subject({'image1':img})
tr = tio.RandomAffine(copy=False)
sujt = tr(suj)
suj.image1.data[0,0,0,:4]
Out[16]: tensor([0.7858, 0.4274, 0.7144, 0.6861])
sujt.image1.data[0,0,0,:4]
Out[17]: tensor([0.7443, 0.5312, 0.6450, 0.3963])
So the behaviour is the same as with copy=True: the original subject is left untouched and the transformed one is a new copy, despite the copy=False argument.
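One quick way to see that a copy was still made, continuing the snippet above (the identity checks are only illustrative):

# With copy=False one would expect the transform to hand back the same objects:
sujt is suj                # False here -> a new Subject was returned
sujt.image1 is suj.image1  # False here -> the images were copied as well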
But wait, I found out where it comes from: RandomAffine creates another transform, Affine, but forgets to pass the copy argument, so the actual transform is applied with copy=True!
If I change the line
https://github.com/fepegar/torchio/blob/cfc8aaff11f346cc3286257ab7903464e7d5cfdb/src/torchio/transforms/augmentation/spatial/random_affine.py#L187
to
transform = Affine(**self.add_include_exclude(arguments), copy=self.copy)
then
tr = tio.RandomAffine(copy=False)
now works as expected (i.e. suj and sujt are now identical!).
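For completeness, with that one-line patch the same check should flip (a sketch of the expected outcome, reusing suj from the tensor example above):

tr = tio.RandomAffine(copy=False)
sujt = tr(suj)
sujt is suj   # True -> the input subject itself was transformed in place
suj.history   # now records the applied Affine on the original subject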
So this is indeed related to this issue #1223.
Note that it now also works for my first example (with the Colin27 subject), so the copy argument acts the same whether or not one uses a tensor as input.
@nicoloesch I'm assuming #1228 needs to be merged before this? @romainVala are you happy with the changes in this PR?
Sorry folks, I'm a bit busy these days.
Fixes #1222.
Description
This change removes the custom __copy__ logic in torchio.Subject. As a result, calling copy.copy() on a torchio.Subject will no longer create a mixture of shallow and deep copies. Therefore, to obtain a fully independent copy of a subject, copy.deepcopy(subject) should now be used instead of copy.copy(subject).
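For downstream code that relied on the old mixed behaviour, a minimal sketch of the distinction after this change (the subject here is a synthetic, tensor-backed one):

import copy
import torch
import torchio as tio

subject = tio.Subject(image=tio.ScalarImage(tensor=torch.rand(1, 8, 8, 8)))

shallow = copy.copy(subject)   # plain shallow copy: the Image objects are shared with subject
deep = copy.deepcopy(subject)  # full copy: the Images and their tensors are duplicated

# Modifying deep leaves subject untouched; modifying the images of shallow also
# affects subject, because both reference the same Image instances.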
Removing this mix of deep and shallow copies within the method for a shallow copy (__copy__) appears to be beneficial for memory management in Python: the garbage collector can now correctly free torchio.Subject instances and their images once they are no longer referenced. This required changing copy.copy to copy.deepcopy in the torchio.Transform base class so that the original subject is not modified but rather a copy of it is created and transformed (applied only if copy=True during instantiation of any transform). I assume this is the expected behaviour based on the provided documentation and code logic.
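A short sketch of the intended contract described above (again with a synthetic tensor; copy=True is the default for all transforms):

import torch
import torchio as tio

subject = tio.Subject(image=tio.ScalarImage(tensor=torch.rand(1, 8, 8, 8)))

# copy=True (default): the subject is deep-copied first, the copy is transformed
# and returned, and the original image data stays untouched.
transformed = tio.RandomAffine(copy=True)(subject)

# copy=False: the transform is meant to operate on the input subject itself,
# so no additional copy of the image data is kept alive.
in_place = tio.RandomAffine(copy=False)(subject)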
Checklist

- I have read the CONTRIBUTING docs and have a developer setup (especially important are pre-commit and pytest)
- Tests pass locally by running pytest
- Documentation builds by running make html inside the docs/ folder

Important Notes
The absence of the memory leak has only been verified on my local machine, as outlined in the comment in the original issue, and should be verified independently by the reviewer on their machine following the instructions in the linked comment.