Closed ghost closed 7 years ago
I don't. I'm not very familiar with Passenger to offer suggestions on what might be happening under the hood. SuckerPunch was made with the purpose of performing small, fast jobs. Keeping a job open like this for an extended period of time, while might work, just wasn't it's intended use case.
Stepping back, it feels like Passenger is closing a process or thread that contain this longer running job. It's possible their model doesn't work well with the library. I've not hear of other reports like this, but I also haven't heard of others doing longer running jobs.
If feels like this kind of logic might be better served with a stand-alone process. That way, you can to control the startup/shutdown on your own, without being sensitive to threads. If you want to stay with the threading approach, you might also trying creating your own threads in Ruby that does this listening. From the code above, there doesn't look to be anything SuckerPunch-specific that gives you an advantage. To me, that would be things like enqueueing many jobs over time. Given that this one stays running, the same could be done by starting a thread in the background on startup, similar to how error-reporting gems work when reporting caught exceptions.
Hope this helps! Sorry I couldn't offer more insight.
One other thought, would be there a way to use jobs instead of the DB pubsub for notification?
Somewhere else you must send the "finished" event to PG, so rather than push it to the DB, can you enqueue another SuckerPunch job that would have the listening code run? This would make the jobs smaller and more robust surviving across process restarts.
Thanks for those helpful remarks, this passenger issue kind-of blindsided me having run a stress test on Puma for 12+ hours with no issues; ho hum, back to the drawing board, I'll organise my own daemon for this.
Unfortunately enqueueing SP jobs is not an option since my "insertion program" is written in C :smile:
This clearly not an SP issue in any case, so closing; thanks again.
@jjgreen Ah, gotcha. I worked on a system that used Redis pubsub in a similar way and we had a longer-running process listening for those events coming from Redis that worked great. It was a bummer to have to go through the work, but definitely the more reliable route.
Just to close this off, the problem turned out to be passenger "smart spawning" which duplicated file descriptors, one must reestablish the DB connection in order that one process closing its connection does not close the other. Passenger handles this for the default ActionRecord connection, but not my secondary connection.
In any case, I followed good advice and moved the code into a daemon, which proved to be straightforward and only took a couple of hours.
Ah, awesome. Makes sense. Thanks for following up. I'm sure others have been bitten by this before.
Glad to hear the process-route went well for you.
On Sun, May 7, 2017 at 4:29 PM J.J. Green notifications@github.com wrote:
Just to close this off, the problem turned out to be passenger "smart spawning" which duplicated file descriptors, one must reestablish the DB connection in order that one process closing its connection does not close the other. Passenger handles this for the default ActionRecord connection, but not my secondary connection.
In any case, I followed good advice and moved the code into a daemon, which proved to be straightforward and only took a couple of hours.
— You are receiving this because you commented.
Reply to this email directly, view it on GitHub https://github.com/brandonhilkert/sucker_punch/issues/198#issuecomment-299732359, or mute the thread https://github.com/notifications/unsubscribe-auth/AAtbFE8nNpdxjnHzG7ZyLzezWZUeeI6Oks5r3imwgaJpZM4NN4b0 .
--
So I have a Rail 4.2 app, it has 2 DBs, one for Rails, one for a load of data. I have a worker which inserts data into the data DB and uses the Postgres Notify mechanism to let the application know it is finished. I then use Sucker Punch to handle the notifications (send them to Pusher in fact). Works fine under Puma, not under Passenger. The stripped down code which exhibits the behaviour is
I have an
UpdateJob.perform_async
in a initialiser.The behaviour under passenger is the following
About 12 events, 5 minutes, then an exception which is swallowed by the
with_connection
wrapper and drops me into theensure
. Removing thewith_connection
allows me to catch the exception, SSL corrupt response mac or some such, turn off SSL, server closed connection.I have no idea what is going on here, this is on real hardware, not Heroku, Debian stable but repeatable on my Ubuntu laptop. I did see some comments in issues from 2013, but apparently Sucker Punch has obsoleted those now.
Any idea what would be happening here?