Open huynhjl opened 5 years ago
As far as I can tell from a memory dump, Disposer.records indirectly holds a reference to the graph, which prevents it from being garbage collected. This is because Session.apply adds a closeFn function to graph.nativeHandleWrapper.preCleanupFunctions so that the graph can clean up the session and close graph.reference, but that function itself keeps the graph reachable and so prevents garbage collection.
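The retention chain described above can be illustrated with a minimal, self-contained sketch. Everything in it (FakeGraph, HandleWrapper, openSessionLeaky, openSessionSafe, graphCollected) is a hypothetical stand-in written for illustration and is not the actual TF Scala code; it only assumes the shape described in this comment: a global registry of disposal functions, and a cleanup hook whose closure captures the graph.

```scala
import java.lang.ref.{PhantomReference, ReferenceQueue, WeakReference}
import scala.collection.mutable

// Hypothetical, simplified stand-ins; none of this is the real TF Scala code.
object DisposerLeakSketch {
  // Wraps a pretend native handle; hooks run before the handle is released.
  final class HandleWrapper {
    val preCleanupFunctions: mutable.Set[() => Unit] = mutable.Set.empty
  }

  final class FakeGraph {
    val nativeHandleWrapper = new HandleWrapper
    def close(): Unit = ()
  }

  // Global registry, reachable from a static root for the life of the JVM.
  private val queue = new ReferenceQueue[AnyRef]
  val records: mutable.Map[PhantomReference[AnyRef], () => Unit] = mutable.Map.empty

  def newGraph(): FakeGraph = {
    val graph = new FakeGraph
    val wrapper = graph.nativeHandleWrapper
    // The disposal function captures only the wrapper, which by itself is harmless.
    val dispose: () => Unit = () => wrapper.preCleanupFunctions.foreach(f => f())
    records.put(new PhantomReference[AnyRef](graph, queue), dispose)
    graph
  }

  // Leaky hook: the closure captures `graph`, completing the chain
  // records -> dispose -> wrapper -> preCleanupFunctions -> closeFn -> graph.
  def openSessionLeaky(graph: FakeGraph): Unit = {
    val closeFn: () => Unit = () => graph.close()
    graph.nativeHandleWrapper.preCleanupFunctions += closeFn
  }

  // Non-leaky hook: the closure captures only a separate session token.
  def openSessionSafe(graph: FakeGraph): Unit = {
    val sessionToken = new Object
    val closeFn: () => Unit = () => { sessionToken.hashCode(); () }
    graph.nativeHandleWrapper.preCleanupFunctions += closeFn
  }

  // Builds a graph in its own stack frame so no local variable keeps it alive.
  private def makeGraph(leaky: Boolean): WeakReference[FakeGraph] = {
    val graph = newGraph()
    if (leaky) openSessionLeaky(graph) else openSessionSafe(graph)
    new WeakReference(graph)
  }

  // Returns true if the graph became unreachable and was collected.
  def graphCollected(leaky: Boolean): Boolean = {
    val weak = makeGraph(leaky)
    var attempts = 0
    while (weak.get() != null && attempts < 20) {
      System.gc(); Thread.sleep(10); attempts += 1
    }
    weak.get() == null
  }

  def main(args: Array[String]): Unit = {
    println(s"leaky hook, graph collected: ${graphCollected(leaky = true)}")
    println(s"safe hook, graph collected:  ${graphCollected(leaky = false)}")
  }
}
```

With the leaky hook the graph stays strongly reachable from the registry, so it can never be collected no matter how often the GC runs. With the safe hook nothing but the phantom reference points at the graph, so it is eligible for collection; System.gc() is only a hint, so how quickly that happens depends on the JVM. Whether the real fix should restructure closeFn along these lines is a separate design question for the maintainers.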
@eaplatanios I think this issue is the most relevant one when considering large-scale experimentation/training and hyper-parameter tuning using TF_Scala.
Currently, implementations such as TunableTFModel in the DynaML API rely on graph.close() to free up resources.
Let us know if there is any way I, @huynhjl or others can help in resolving this, although my understanding of the codebase is still fairly high level.
cc @sbrunk @lucataglia @DirkToewe
I'm going to look into it.
I'm sorry I've been off TF Scala for a while, working on other projects. @mandar2812 @DirkToewe @sbrunk if you're interested, we could have a conference call at some point to help you understand the codebase at a deeper level. Just let me know and we can plan it.
@eaplatanios I would love that! Maybe we should make a doodle and fix a time that's okay for all of us? What do you think @sbrunk @DirkToewe?
A tour of the project would be greatly appreciated! I just need about two days to take a look at the code again (it's been a while) so I can ask better questions. There is an unofficial TensorFlow(JS) Discord server that we could use to coordinate and talk.
I've been a bit disconnected from TF Scala since I left academia but I'd still be interested in joining a call about the codebase. I'm also super interested in what you think about Swift for TF since I've seen you've worked with it too :)
Sounds good to me! And yes, I've been working on Swift for TF for quite some time now and would also be happy to talk about that. :) Does someone want to coordinate this? A doodle poll may be a good start. I'm sorry but I've been super busy lately.
@eaplatanios @sbrunk @DirkToewe I'll set up a doodle poll this weekend.
@mandar2812 just a gentle ping about the poll. We can also schedule it informally here. My schedule is quite flexible over the next week.
@eaplatanios sorry for this huge delay in setting up the doodle :D. I'm finishing my thesis next week, so I would prefer sometime in the last 10 days of August. Is that okay for you guys?
Fine with me.
Hi Anthony,
The following code leaks Graph and Op objects in my environment:
If I call the test method repeatedly, the objects are not garbage collected. VisualVM shows the GC root going through org.platanios.tensorflow.api.utilities.Disposer.records.
Here is an example that should compile: https://gist.github.com/huynhjl/00a9ee6958f1b0143b701eb7b2563005
Let me know if I'm doing anything wrong.