MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

flows taking a long time to exit #595

Open petersilva opened 1 year ago

petersilva commented 1 year ago

In the name of ensuring state is saved, there is a signal handler in flow (please_stop). All that handler does is catch SIGTERM and set the please_stop flag. If a flow is in a sleep, that sleep continues, so with a polling interval of 900 seconds, the cleanup does not happen until that sleep ends normally.

This is a pita in development, and will likely result in sr3 stop losing patience and doing a kill -9. I added a please_stop_immediately flag for now, but I think the real solution is for the please_stop signal handler to re-raise (?) the signal, so that the sleep returns quickly and exit processing happens much faster.
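For context on why the sleep keeps going: since Python 3.5 (PEP 475), time.sleep() resumes after a signal handler that returns normally, and is only cut short if the handler raises an exception. The sketch below illustrates that second case. It is not what sarracenia does; `StopRequested`, `raise_stop` and `sleep_until_signalled` are hypothetical names made up for illustration.

```python
import signal
import time


class StopRequested(Exception):
    """Hypothetical exception used to break out of a blocking sleep."""


def raise_stop(signum, frame):
    # A handler that only sets a flag lets time.sleep() resume (PEP 475).
    # Raising here propagates into whatever the main thread was doing.
    raise StopRequested(signum)


def sleep_until_signalled(seconds):
    """Sleep up to `seconds`; return the elapsed time.

    The sleep ends early if a handler raises StopRequested.
    """
    started = time.monotonic()
    try:
        time.sleep(seconds)
    except StopRequested:
        pass  # real code would save state and exit here
    return time.monotonic() - started
```

The handler would be installed with something like `signal.signal(signal.SIGTERM, raise_stop)`. One caveat with this approach: the exception can surface at any point in the main thread, not only inside the sleep, so it could also interrupt the state saving that the handler is meant to protect.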

petersilva commented 1 year ago

d0f3f7db774c3d370f2cdd1d373d26ceefc426be trying to address.

petersilva commented 1 year ago

So far I can't figure out how to interrupt the sleep() call, so instead I am breaking individual sleeps into naps and checking from time to time whether a stop has been requested.
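A minimal sketch of the nap idea, assuming a flag that the signal handler sets. `NappingFlow`, `nap` and `nap_length` are hypothetical names for illustration, not the code in the commits referenced in this thread.

```python
import time


class NappingFlow:
    """Hypothetical stand-in for a flow; not sarracenia's actual class."""

    def __init__(self, nap_length=1.0):
        self.please_stop = False      # would be set by the SIGTERM handler
        self.nap_length = nap_length  # longest time between flag checks

    def nap(self, total):
        """Sleep up to `total` seconds in short naps.

        Returns True if a stop was requested, False if the full time elapsed.
        """
        deadline = time.monotonic() + total
        while not self.please_stop:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(self.nap_length, remaining))
        return True
```

With this shape, a 900 second polling interval becomes `flow.nap(900)`, and the worst-case delay before cleanup starts is roughly one nap_length rather than the whole interval.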

petersilva commented 1 year ago

There are naps in the flow/run loop now. I see there are ebo sleeps in the moth classes; I figure those are OK? Not sure...

petersilva commented 1 year ago

also: d0f3f7db774c3d370f2cdd1d373d26ceefc426be

petersilva commented 1 year ago

and finally 17bc699b31422eee3a97ffaf06d03821768c3887

petersilva commented 1 year ago

Question remains about the ebo sleep loops: should we add naps?
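If naps were added to a back-off loop, it might look something like the sketch below. This assumes "ebo" is a wait that doubles after each failed attempt; `backoff_retry` and all of its parameters are hypothetical and not taken from the moth classes.

```python
import time


def backoff_retry(attempt, should_stop, ebo_start=1.0, ebo_max=64.0, nap_length=1.0):
    """Call attempt() until it returns True, doubling the wait after each failure.

    Returns True on success, False if should_stop() became true first.
    """
    ebo = ebo_start
    while not should_stop():
        if attempt():
            return True
        # Wait `ebo` seconds, but in naps so a stop request is noticed quickly.
        deadline = time.monotonic() + ebo
        while not should_stop():
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            time.sleep(min(nap_length, remaining))
        ebo = min(ebo * 2, ebo_max)
    return False
```

Here `attempt` would wrap something like a connection or queue declaration, and `should_stop` would read the please_stop flag, for example `lambda: flow.please_stop`.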

petersilva commented 1 year ago

Still need to look at the moth/ queue declaration loops... they seem to hang when there are errors there.

petersilva commented 1 year ago

moth classes are addressed by #648

There are still hangs in transfer classes during connection... not sure if that is needed.