Description
I am getting memory corruption issues with a stateful BLS model. It seems like Triton is trying to free memory that is still in use.
Triton Information
24.07
Are you using the Triton container or did you build it yourself?
I used the Triton container, version 24.07.
To Reproduce
I have a stateful, decoupled BLS model (Python backend) that causes this issue. The BLS model just takes its input and puts it in an internal queue, which is consumed by a thread that calls a sequence of other models and finally returns the response of the last model.
It runs for 1-2 minutes, then suddenly crashes and the model gets killed. There are no error logs other than the following:
Reference count error detected: an attempt was made to deallocate the dtype 17 (O)
I've attached logs with this issue.
The model input comes from a streaming client, which sends an audio chunk every 50 ms. The BLS model passes those chunks to a set of other models sequentially and finally returns the output to the streaming client.
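To make the structure concrete, here is a minimal sketch of the queue-plus-worker-thread pattern described above, using only the Python standard library. It is not my actual model code and does not reproduce the crash. The names `ChunkWorker`, `run_pipeline`, `stages`, and `send_response` are hypothetical: in the real model each stage would be a BLS call to another model, and `send_response` would hand the final output back to the streaming client.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the worker thread to exit


def run_pipeline(chunk, stages):
    """Pass one chunk through each stage in order and return the last output."""
    out = chunk
    for stage in stages:
        out = stage(out)
    return out


class ChunkWorker:
    """Hypothetical stand-in for the BLS model's internal queue and thread."""

    def __init__(self, stages, send_response):
        self._stages = stages        # assumed: one callable per downstream model
        self._send = send_response   # assumed: callable that returns output to the client
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def submit(self, chunk):
        # Called from the request path: enqueue and return immediately.
        self._queue.put(chunk)

    def close(self):
        # Drain remaining chunks, then stop the worker.
        self._queue.put(_STOP)
        self._thread.join()

    def _loop(self):
        while True:
            chunk = self._queue.get()
            if chunk is _STOP:
                break
            self._send(run_pipeline(chunk, self._stages))
```

In this sketch the request path only enqueues, and a single worker thread preserves chunk order, which matches the sequential behavior I described.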
Expected behavior
The model should not crash.
crashlog15.txt
crashlog16.txt