Open jpgill86 opened 5 years ago
Hi jeffrey, thank you to catch this.Andrew have a long date plan to make a pseudo list object to deal with relationship in neo. It is catually a mess. At implementtation time, we could consider make weakref at least for one way bottum up.
The fact that file are still open when del to all object is strange. Of course when rawio many object have an internal memmap to data. So yes the file stay open when object is still alive. Note that for proxyobject there is a link to the IO : self._rawio I guess this could be a weakref. The ptach is easy to do.
Andrew have a long date plan to make a pseudo list object to deal with relationship in neo. It is catually a mess. At implementtation time, we could consider make weakref at least for one way bottum up.
Yes, this sounds like a good way to do it eventually.
Of course when rawio many object have an internal memmap to data. So yes the file stay open when object is still alive.
Exactly. A strong reference to the RawIO persists in each proxy object, and the memmaps keep the file open. Each proxy object persists, even after del io; del blk
, because of the circular references, until garbage collection runs.
Note that for proxyobject there is a link to the IO : self._rawio I guess this could be a weakref. The ptach is easy to do.
Good point, it may be easier to solve the file lock issue (not the memory leak problem) with a few targeted weakrefs.
However, the first few things I tried didn't work.
I tried replacing in each proxy object
self._rawio = rawio
with
self._rawio = weakref.proxy(rawio)
This had the effect of allowing the file to unlock with del io
, and del blk
is not needed because there are no strong references to the RawIO in blk
. Unfortunately, this means that the following variant of reading the file will lose all its strong references, and the RawIO will be destroyed immediately:
blk = neo.io.get_io(filename).read_block(lazy=True)
Here I left out the intermediate step of creating io
, which was the only strong reference to the RawIO.
This would probably break a lot of existing code, so that's not a good route to go.
I briefly thought maybe we needed weakrefs to the proxy objects themselves in BaseFromRaw
(e.g., anasig = weakref.proxy(AnalogSignalProxy(rawio=self, ...))
), but this doesn't work either because the proxy objects are destroyed instantly.
I next tried adding a single strong reference to the RawIO in the seg
returned by read_segment
(as a first try, I did if lazy: seg.rawio = self
), thinking we could then use self._rawio = weakref.proxy(rawio)
in each proxy object. However, this leads us back to the original problem: The segment is never destroyed even after del blk
because of all the circular references.
The best solution I can think of right now, other than a widespread overhaul of relationships that uses weakrefs in all the right places, is to add a close_file
method to blk
objects returned by read_block
when lazy=True
, which could traverse its children and get rid of every reference to the RawIO.
Does that sound like a good idea or a bad one? Would it be better to live with the current behavior that locks the file and use manual garbage collection, until relationships can be overhauled? Or is the "quick fix" of adding close_file
a reasonable short term alternative?
Thanks for this big investigation.
I did not figure out alll theses details.
The wiser way is working of relationship with weakref, so the Andrew's plan.
@apdavison @JuliaSprenger ?
With code like this,
the file
filename
will remain locked until the Python process ends.I believe the cause is circular references to proxy objects, such as this:
https://github.com/NeuralEnsemble/python-neo/blob/70f78117bc2173a25e21149f3cdc69262df6e446/neo/io/basefromrawio.py#L264-L265
These circular references appear all over the place in Neo and are not unique to proxy objects. I suspect that because of them, most Neo objects in memory are not destroyed even after commands like
del io; del blk
. Not only is this a source of a memory leak, but in the case of lazy loading using proxy objects, the file becomes locked (e.g., it cannot be deleted in Windows at least).Manually performing garbage collection is a workaround:
However, the best solution I think would be to use weak references.