Open zsmanjot opened 2 months ago
The problem is getting worse, with long delays. I checked and found that MongoDB is running at high CPU.
Any ideas what can be done here? I know that the number of triggers these days is too high to handle, so could that be the reason? If so, how can we address this?
I can also see in the DB that 4767 workflows are in the delayed state.
@arm4b Any solution here?
Can you show what state your actions are in? I asked the same question on Slack today. I have been struggling with this for a long time and am still trying to solve it.
My older workflows never make it to execution. If one somehow does, it hangs on one task or another for hours and only completes after 2 or 3 hours.
I am also getting this error.
root@stackstorm:~# st2 execution list -l --status delayed -n 2000 2>/dev/null
ERROR: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
This is a known issue. If the RabbitMQ connection retries are exhausted, an action gets stuck in the running state forever. Your box is likely experiencing internal network issues. Do your workflows create a very large context or have very large inputs or outputs?
Thanks @guzzijones for replying. There are no underlying network issues. As for large inputs and outputs, no, they are not very large. What we have noticed is that the number of triggers being received nowadays is huge.
But the main concern is that ST2 keeps workflows queued for days and never executes them.
Can you see what state most workflow instances are in?
They are all stuck in delayed state. More than 5000 workflows.
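To answer the question about which state most executions are in, a tally by status can help. The following is only a minimal sketch: it assumes the executions have already been exported to a list of records with a `status` field, and both the function name `count_by_status` and the sample records are hypothetical, not taken from this deployment.

```python
from collections import Counter


def count_by_status(executions):
    """Return how many executions are in each status.

    `executions` is assumed to be a list of dicts that each carry a
    "status" key (an assumed export shape, not a documented st2 format).
    """
    return Counter(e["status"] for e in executions)


# Hypothetical sample records, for illustration only.
sample = [
    {"id": "a1", "status": "delayed"},
    {"id": "a2", "status": "delayed"},
    {"id": "a3", "status": "running"},
]

for status, count in count_by_status(sample).most_common():
    print(status, count)
```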
What is in your st2-workflow-engine logs and st2-action-runner logs? I bet you see disconnects from RabbitMQ.
@guzzijones No, I could not see any RabbitMQ disconnects. Even if I try to purge older workflows, nothing happens, and I have to grep the IDs and cancel these older workflows manually.
This is a big performance issue.
This is one of the examples:
See the requested and scheduled times: a 3 hour delay. How can we reduce this delay? What factors might we be missing here? Any ideas?
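The gap between the requested and scheduled times can be computed directly once both timestamps are available as text. This is a rough sketch under stated assumptions: the timestamp format and the helper name `scheduling_delay_seconds` are hypothetical, and the values in the usage line are made up to mirror the 3 hour delay described above.

```python
from datetime import datetime

# Assumed timestamp layout; adjust to match the actual execution output.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def scheduling_delay_seconds(requested: str, scheduled: str) -> float:
    """Return the number of seconds between request and scheduling."""
    requested_at = datetime.strptime(requested, TIME_FORMAT)
    scheduled_at = datetime.strptime(scheduled, TIME_FORMAT)
    return (scheduled_at - requested_at).total_seconds()


# Hypothetical timestamps, three hours apart.
delay = scheduling_delay_seconds("2023-01-01 10:00:00", "2023-01-01 13:00:00")
print(f"delay: {delay / 3600:.1f} hours")
```

Tracking this value across many executions would show whether the delay grows with trigger volume or stays constant.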
The queue is probably blocked. If the performance problem cannot be solved, you can add filter criteria to the rule so that only the data that really needs automated processing triggers a workflow.
Hi there, I would appreciate help with an issue. We use ST2 extensively and many workflows run on the box. The box is well provisioned in terms of memory and CPU.
We have noticed that workflows get queued and never executed, and the delayed queue is always far too long whenever we check it.
For example:
If I check the delayed queue by running the following command (st2 execution list -l --status delayed), I can see workflows from 2 days ago that never made it to execution. Because of this, other workflows are also affected: a simple workflow that normally takes 10 minutes takes 50 minutes to finish.
Can anybody help me here?
Example: