snimmagadda / pop3d

POP3 Daemon with POP3S, STARTTLS extensions.

pop3d still exiting under heavy load #6

Closed: jturner closed this issue 10 years ago

jturner commented 10 years ago

I received a report from a user who runs pop3d with many users that pop3d is still exiting in certain cases.

Aug 16 21:11:06 box1 pop3d[16513]: 3020625651: maildrop updated
Aug 16 21:11:06 box1 pop3d[23063]: 3020625651: session closed
Aug 16 21:11:06 box1 pop3d[16513]: maildrop process exiting
Aug 16 21:11:07 box1 pop3d[9588]: maildrop process exiting
Aug 16 21:11:07 box1 pop3d[16032]: pop3d exiting

Aug 16 21:19:03 box1 pop3d[8764]: Lost pop3 engine
Aug 16 21:19:03 box1 pop3d[8764]: pop3d exiting
Aug 16 21:19:03 box1 pop3d[22391]: pop3 engine exiting

snimmagadda commented 10 years ago

Ouch! mbox or maildir? Are any core dumps available from the crash? Is the user running OpenBSD-current? Could he reproduce it with kern.nosuidcoredump=3 set and a /var/crash/pop3d directory in place? (See the last example in sysctl(8).)

jturner commented 10 years ago

I believe they are using mbox. I've asked if a core dump was created and if they are willing to run pop3d with nosuidcoredump=3 for a bit.

I'll update this issue when I find out more.

jturner commented 10 years ago

No core dump was generated, and they are not in a position to try pop3d with nosuidcoredump=3 right now. Any thoughts based on the limited info?

snimmagadda commented 10 years ago

This set of logs...
Aug 16 21:19:03 box1 pop3d[8764]: Lost pop3 engine
Aug 16 21:19:03 box1 pop3d[8764]: pop3d exiting
Aug 16 21:19:03 box1 pop3d[22391]: pop3 engine exiting

is intriguing. While merging commit d295784 (pull request #3), I failed to change the log message to "generic child" instead of "pop3 engine", even though we now waitpid with WAIT_ANY. Killing or signalling the "pop3 engine" process wouldn't generate the "Lost pop3 engine" log message, as we event_loopexit() on IMSGEV_DONE (libevent calls the EV_READ handler first and then the signal handler). So apparently one of the maildrop processes crashed. I am focusing on maildrop.c and mbox.c, and will update this issue as soon as I find anything suspicious.
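To illustrate why the message is misleading: once a parent reaps with WAIT_ANY, it only learns that some child went away, not which role that child had. The following is a minimal sketch of that pattern, not pop3d's actual code; the helper names describe_child and spawn_and_reap and the message wording are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef WAIT_ANY
#define WAIT_ANY (-1)	/* "any child"; defined here if the libc hides it */
#endif

/*
 * Hypothetical helper: format how a reaped child ended. The text says
 * "child", not a specific role, because WAIT_ANY can return any child.
 */
void
describe_child(pid_t pid, int status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	if (WIFEXITED(status))
		snprintf(buf, len, "Lost child %ld: exited with status %d",
		    (long)pid, WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "Lost child %ld: killed by signal %d",
		    (long)pid, WTERMSIG(status));
	else
		snprintf(buf, len, "Lost child %ld: raw status 0x%x",
		    (long)pid, (unsigned int)status);
}

/*
 * Hypothetical demo: fork a child that exits with `code`, reap it with
 * WAIT_ANY, and describe the result. Returns the reaped pid, or -1 on error.
 */
pid_t
spawn_and_reap(int code, int *status_out, char *buf, size_t len)
{
	pid_t child, reaped;
	int status = 0;

	child = fork();
	if (child == -1)
		return -1;
	if (child == 0)
		_exit(code);	/* child: terminate immediately */

	/* Parent: wait for any child, retrying if a signal interrupts us. */
	do {
		reaped = waitpid(WAIT_ANY, &status, 0);
	} while (reaped == -1 && errno == EINTR);
	if (reaped == -1)
		return -1;

	describe_child(reaped, status, buf, len);
	if (status_out != NULL)
		*status_out = status;
	return reaped;
}
```

A real daemon would typically call waitpid from its SIGCHLD handling with WNOHANG in a loop; the blocking call here just keeps the sketch short. A crashed child shows up through the WIFSIGNALED branch, which is the case worth logging with the signal number.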

snimmagadda commented 10 years ago

Could you please try the latest changes on master? I have fixed some crashes that happened during session closing with respect to imsgev, a missing bounds check, and a race condition in maildrop init.

jturner commented 10 years ago

So far so good: he hasn't seen a crash yet with your recent changes.

jturner commented 10 years ago

The problem seems to be fixed. He's been running the updated daemon for 2 days now under heavy load without a single crash. When you're ready, if you tag a new release I can update the port. Thanks again!

snimmagadda commented 10 years ago

Tagged as v1.0.1. Thanks for testing and confirming.