arnavmehta7 opened 12 months ago
Update: There seems to be a small bump over several days; could it be due to accumulation of logs?
@arnavmehta7 Here is a report on JVM memory usage in TorchServe. The following picture shows the memory usage of a vgg16 scripted-model soak test (a long-running job), where the JVM is configured with
vmargs=-Xmx4g -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError
Okay @lxning, so you think the leak cannot be due to the JVM? It seems I will have to deploy a hello-world model on k8s to check.
Am I right to conclude that it reclaims memory after all requests are done?
@arnavmehta7 here is the code link about removing a job from jobQ once a worker is ready to process it.
Thank you @lxning. I am doing more testing and will post soon. Most probably my fault.
Btw, do you know if there's any way to have k8s itself restart the pod, rather than letting torchserve get stuck in a restarting-dying loop?
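One common approach (not from this thread, just a sketch): TorchServe's inference API serves `GET /ping` on the inference port (8080 by default), so a Kubernetes liveness probe against it lets the kubelet restart the container when the frontend stops responding. The timings below are placeholders, not recommendations:

```yaml
# Illustrative liveness probe for a TorchServe container spec.
# Timings are placeholder values; tune them for your startup time.
livenessProbe:
  httpGet:
    path: /ping      # TorchServe health endpoint on the inference API
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 30
  failureThreshold: 3
```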
🐛 Describe the bug
I first deployed the same model with the same requirements on another torchserve alternative on k8s. It works perfectly and the memory graph is fully stable. However, it didn't provide batching and worker support like torchserve, so I moved to torchserve.
However, with the same code and the same requirements, there is a small buildup over time. Each inference shows a peak, then a small valley, but the valley is smaller than the peak, so memory usage increases slowly.
I set the number of netty_threads and set the vmargs, which reduced the leakage, but it is still building up over time.
My model and its supporting code had zero leakage, so the issue seems to come from Java or torchserve itself.
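For reference, settings like these go in `config.properties`; the values below are illustrative placeholders, not the reporter's actual configuration:

```properties
# Illustrative config.properties fragment (placeholder values)
number_of_netty_threads=4
vmargs=-Xmx4g -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError
```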
NOTE: I pass big dictionaries between preprocess -> inference -> postprocess
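To illustrate the data flow being described (a real handler subclasses `ts.torch_handler.base_handler.BaseHandler`; this standalone sketch only shows the pattern), the key point is that any large dict built in `preprocess` stays alive until `postprocess` returns, and stashing it on the handler instance keeps it alive across requests, which looks like a slow leak:

```python
# Standalone sketch of a TorchServe-style handler pipeline.
# All names are hypothetical; a real handler receives model artifacts
# via a context object and subclasses BaseHandler.

class SketchHandler:
    def preprocess(self, requests):
        # Build the (potentially large) dict passed between stages.
        return {"inputs": [r.get("body") for r in requests]}

    def inference(self, data):
        # Placeholder for the actual model call; mutates the shared dict.
        data["outputs"] = [len(x or "") for x in data["inputs"]]
        return data

    def postprocess(self, data):
        # Return only what the client needs; do NOT keep a reference to
        # `data` on `self`, or it survives the request and accumulates.
        return data["outputs"]

    def handle(self, requests):
        data = self.preprocess(requests)
        data = self.inference(data)
        return self.postprocess(data)


if __name__ == "__main__":
    h = SketchHandler()
    print(h.handle([{"body": "abc"}, {"body": "de"}]))  # [3, 2]
```

The large intermediate dict becomes garbage as soon as `handle` returns, provided nothing (module globals, instance attributes, log buffers) retains a reference to it.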
Error logs
Memory example: ![image](https://github.com/pytorch/serve/assets/65492948/67451a09-3f93-46d6-ab85-1a3436b58686)
Installation instructions
Using the torchserve GPU Docker image from Docker Hub
Model Packaging
I am just using a custom model, which I cannot share, and running it with a custom handler. Nothing too fancy. I used the examples to make it.
config.properties
Versions
Repro instructions
Any example on k8s should most likely reproduce this
Possible Solution
No response