With pytorch-engine:0.18.0, `NDManager.newBaseManager()` creates a `PtNDManager` by calling `ai.djl.pytorch.engine.PtNDManager#newSubManager`, which executes:
```java
PtNDManager manager = new PtNDManager(this, device);
attachUncappedInternal(manager.uid, manager);
return manager;
```
`attachUncappedInternal` is implemented by `BaseNDManager` and adds the created `PtNDManager` to the parent's `resources` field:
```java
resources.put(resourceId, resource);
```
The created `PtNDManager` is never released, even after it is closed:
```java
public void close() {
    if (!closed.getAndSet(true)) {
        // ignore some code
        parent.detachInternal(uid);
        resources.clear();
        tempResources.clear();
    }
}
```
The `parent` is `PtNDManager$SystemManager` and parent's `detachInternal` does nothing.
```java
@Override
public void detachInternal(String resourceId) {}
```
As a result, the created `PtNDManager` stays reachable through the parent's `resources` and is never collected by the JVM garbage collector.
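The retention path described above can be modeled without DJL. The following is a minimal, self-contained sketch, not DJL code: `RootManager` and `ChildManager` are hypothetical stand-ins for `PtNDManager$SystemManager` and `PtNDManager`, and the sketch assumes only the behavior reported here (attach stores the child in a map, and the root's detach is a no-op).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class LeakSketch {

    /** Hypothetical stand-in for PtNDManager$SystemManager. */
    static class RootManager {
        final Map<String, AutoCloseable> resources = new ConcurrentHashMap<>();

        // Mirrors the reported 0.18.0 behavior: the child is stored in the map.
        void attachUncappedInternal(String resourceId, AutoCloseable resource) {
            resources.put(resourceId, resource);
        }

        // Mirrors the reported no-op override: nothing is ever removed.
        void detachInternal(String resourceId) {}

        ChildManager newSubManager(String uid) {
            ChildManager child = new ChildManager(this, uid);
            attachUncappedInternal(uid, child);
            return child;
        }
    }

    /** Hypothetical stand-in for PtNDManager. */
    static class ChildManager implements AutoCloseable {
        private final RootManager parent;
        private final String uid;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        ChildManager(RootManager parent, String uid) {
            this.parent = parent;
            this.uid = uid;
        }

        @Override
        public void close() {
            if (!closed.getAndSet(true)) {
                // No-op on the root, so the root keeps its reference to this child.
                parent.detachInternal(uid);
            }
        }
    }

    /** Creates and closes n children, then returns how many the root still references. */
    public static int retainedAfter(int n) {
        RootManager root = new RootManager();
        for (int i = 0; i < n; i++) {
            try (ChildManager child = root.newSubManager("child-" + i)) {
                // do something here
            }
        }
        return root.resources.size();
    }

    public static void main(String[] args) {
        System.out.println("retained after close: " + retainedAfter(1000));
    }
}
```

In this model every closed child is still held by the root's map, so the retained count equals the number of managers created, which matches the unbounded growth described in this report.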
Downgrading pytorch-engine to version 0.17.0 resolves the problem, because there `newSubManager` calls `PtNDManager$SystemManager#attachInternal`, which does nothing:
```java
PtNDManager manager = new PtNDManager(this, device);
attachInternal(manager.uid, manager);
return manager;

@Override
public void attachInternal(String resourceId, AutoCloseable resource) {}
```
Expected Behavior
The `SystemManager` should either not attach the created `PtNDManager` to its `resources` field, or release the `PtNDManager` when it is closed.
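To illustrate the second option (releasing the child when it is closed), here is a minimal sketch under the same simplifying assumptions. It is not the actual DJL fix: `RootManager`, `ChildManager`, `attach`, and `detach` are hypothetical names, and the only point is that a detach which really removes the entry leaves nothing behind after `close()`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class ExpectedBehaviorSketch {

    /** Hypothetical root manager whose detach actually drops the reference. */
    static class RootManager {
        final Map<String, AutoCloseable> resources = new ConcurrentHashMap<>();

        void attach(String resourceId, AutoCloseable resource) {
            resources.put(resourceId, resource);
        }

        // Unlike the reported no-op, this removes the child from the map.
        void detach(String resourceId) {
            resources.remove(resourceId);
        }
    }

    /** Hypothetical child manager that detaches itself from the root on close. */
    static class ChildManager implements AutoCloseable {
        private final RootManager parent;
        private final String uid;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        ChildManager(RootManager parent, String uid) {
            this.parent = parent;
            this.uid = uid;
            parent.attach(uid, this);
        }

        @Override
        public void close() {
            if (!closed.getAndSet(true)) {
                parent.detach(uid);
            }
        }
    }

    /** Creates and closes n children, then returns how many the root still references. */
    public static int retainedAfter(int n) {
        RootManager root = new RootManager();
        for (int i = 0; i < n; i++) {
            try (ChildManager child = new ChildManager(root, "child-" + i)) {
                // do something here
            }
        }
        return root.resources.size();
    }

    public static void main(String[] args) {
        System.out.println("retained after close: " + retainedAfter(1000));
    }
}
```

With a working detach, the root holds no references once each child is closed, so the closed managers become eligible for garbage collection.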
Error Message
How to Reproduce?
Use pytorch-engine version 0.18.0.
Execute the code below repeatedly; it will eventually cause an OOM (out-of-memory error).
```java
try (NDManager manager = NDManager.newBaseManager(Device.cpu())) {
    // do something here
}
```
maven dependencies