deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0
4.07k stars 650 forks

Direct Memory can't free #1398

Closed yjmwolf closed 2 years ago

yjmwolf commented 2 years ago

I'm seeing the same problem with the DJL PyTorch inference API that also exists in the libtorch_java_only API. JVM heap memory is stable, but the server's total memory keeps growing over several days until the process crashes; it looks like a direct (native) memory leak. I found this issue reported against the libtorch_java_only API, so I tried switching to the DJL PyTorch inference API, but the problem persists. My prediction code looks like this:

Predictor<FeatureValDO, float[][]> predictor = createPredictor();
try {
    float[][] score = predictor.predict(featureValDO);
} catch (Exception e) {
    infoLogger.error("predict error:", e);
} finally {
    predictor.close();
}
frankfliu commented 2 years ago

@yjmwolf The issue you describe is usually caused by a native memory leak. If all resources are closed properly, you should not see a crash. We have many customers running continuous inference with PyTorch, and they have not noticed memory leaks.

Can you create a minimal reproducible project so we can look into it?
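The "close all resources" advice above follows the standard try-with-resources pattern, since DJL's `Predictor` implements `AutoCloseable`. The sketch below uses a stand-in `FakePredictor` class (hypothetical, so it runs without DJL on the classpath) to show that the pattern releases the resource even when prediction throws:

```java
public class CloseDemo {
    // Counts resources that hold (simulated) native memory.
    static int openCount = 0;

    // Stand-in for a native-backed resource; in DJL, Predictor, NDManager
    // and NDList are closed the same way.
    static class FakePredictor implements AutoCloseable {
        FakePredictor() { openCount++; }
        float[][] predict(Object input) { return new float[][] {{0.5f}}; }
        @Override public void close() { openCount--; }
    }

    public static void main(String[] args) {
        // try-with-resources guarantees close() runs even if predict()
        // throws, so per-request native memory is always returned.
        try (FakePredictor predictor = new FakePredictor()) {
            float[][] score = predictor.predict("featureValDO");
            System.out.println("score: " + score[0][0]);
        }
        System.out.println("open resources: " + openCount);
    }
}
```

Compared with a manual `finally { predictor.close(); }`, try-with-resources also handles the case where construction succeeds but an exception is thrown before the `try` body starts, and it preserves the original exception if `close()` itself fails.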

frankfliu commented 2 years ago

Feel free to re-open this issue if you can provide a reproducible project.