rfaulkner / wikipedia_user_metrics

Wikimedia Foundation E3 Team Analysis Code

Backend. Time series requests die due to dropped connection #79

Open rfaulkner opened 11 years ago

rfaulkner commented 11 years ago

This happens when the connection is dropped, and it results in several hanging threads. One possible fix is to lower the thread count in metrics; however, this is a kludge. Dropped connections need better handling, and threads should also be able to die gracefully. So this really addresses two issues:

1) Handle dropped connections up the stack trace so that worker processes can always die gracefully.

2) In cases where a metrics worker thread throws an exception and the multiprocessing thread pool ends up hanging, have some facility for detecting that and terminating those threads.

rfaulkner commented 11 years ago

It may be worth adding a context manager for these connections:

http://stackoverflow.com/questions/2837822/in-python-how-to-make-sure-database-connection-will-always-close-before-leaving
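A minimal sketch of what such a context could look like, assuming a DB-API style connection object with a `close()` method. The name `db_connection` and its `connect_fn` parameter are hypothetical, and `sqlite3` is used below only as a stand-in so the example is self-contained; it is not the driver this project uses.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_connection(connect_fn):
    """Open a connection with connect_fn() and always close it on exit.

    connect_fn is any zero-argument callable returning an object with a
    close() method (hypothetical parameter, for illustration only).
    """
    conn = connect_fn()
    try:
        yield conn
    finally:
        # Runs on normal exit and when the body raises, for example on a
        # dropped connection, so the handle is never leaked.
        conn.close()


def count_rows(connect_fn, table):
    """Example caller: the connection is closed even if the query fails."""
    with db_connection(connect_fn) as conn:
        cur = conn.cursor()
        # Table name is interpolated for brevity; only pass trusted names.
        cur.execute("SELECT COUNT(*) FROM %s" % table)
        return cur.fetchone()[0]
```

For a connection that needs no extra handling, `contextlib.closing(conn)` from the standard library gives the same close-on-exit guarantee without a custom helper.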

rfaulkner commented 11 years ago

https://github.com/rfaulkner/E3_analysis/commit/02cd02fad0fee970b6402e0c888df5b8a0084d30
https://github.com/rfaulkner/E3_analysis/commit/bc86f94095eddb4fc69048f9f39b4dbd65e85600
https://github.com/rfaulkner/E3_analysis/commit/d6ecc3a19b1dc9a11d65fff5546c7b2818c7ecd2