This happens when the connection is dropped, and it leaves several threads hanging. One possible fix is to lower the thread count in metrics; however, this is a kludge. We need better handling around dropped connections, and threads should also be able to die gracefully. So this really addresses two issues:
1) Handle dropped connections further up the call stack so that worker processes can always die gracefully.
2) When a metrics worker thread throws an exception, the multiprocessing thread pool ends up hanging. Add some facility for detecting that and terminating those threads.
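
A minimal sketch of how both points might look, assuming the metrics workers run in a `multiprocessing.pool.ThreadPool` and that a dropped connection surfaces as a `ConnectionError`. The names `guarded` and `run_tasks` are hypothetical and are not taken from the metrics code. Note that Python cannot forcibly kill a thread: `terminate()` stops the pool from handing out work, and a task that is truly stuck stays behind on a daemon thread until the process exits.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def guarded(task):
    """Wrap a task so a dropped connection becomes a return value (issue 1)."""
    def wrapper(*args):
        try:
            return ("ok", task(*args))
        except ConnectionError as exc:
            # ConnectionError covers ConnectionResetError, BrokenPipeError, etc.
            return ("dropped", str(exc))
    return wrapper


def run_tasks(tasks, timeout, workers=4):
    """Run zero-argument callables and return one (status, value) pair per task.

    A task that does not finish within `timeout` seconds of being waited on
    is reported as ("timeout", None) instead of blocking the caller (issue 2).
    The waits are sequential, so the total wait can exceed `timeout`.
    """
    pool = ThreadPool(workers)
    outcomes = []
    try:
        pending = [pool.apply_async(guarded(t)) for t in tasks]
        for result in pending:
            try:
                outcomes.append(result.get(timeout=timeout))
            except multiprocessing.TimeoutError:
                outcomes.append(("timeout", None))
            except Exception as exc:
                # Any other exception raised inside the worker is re-raised
                # by get(); record it rather than letting it propagate.
                outcomes.append(("error", repr(exc)))
    finally:
        # Stop the pool without waiting for outstanding work.
        pool.terminate()
    return outcomes
```

The caller can then inspect the status of each pair and decide whether to retry, log, or abort, rather than waiting indefinitely on a pool that will never finish.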