Closed idzikovsky closed 1 week ago
Thanks @idzikovsky for reporting this interesting issue, @amitsrivastava @ranade1 - can you guys take a look?
This issue is stale because it has been open 30 days with no activity and is not labeled "Prevent stale". Remove "stale" label or comment or this will be closed in 10 days.
It's strange that no one except me has faced this issue, as I got it on different Hue builds on different operating systems and different Hue configurations.
Is there an existing issue for this?
Description
I think we've hit some strange behavior somewhere in the Hue web server internals: after a few users log in (somewhere around 10), the Hue server stops responding to any request until the gunicorn master process restarts the workers on timeout (configured by the gunicorn_worker_timeout option).

I spent some time debugging this issue but wasn't able to find a root cause. What I've found is that something seems to cause gunicorn connections to hang, but I don't see any exception that might be responsible.
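To make the hang visible from outside the browser, a small probe along these lines could time requests against the server while users log in. This is only a sketch: the `probe` helper is hypothetical and not part of Hue, and any URL passed to it (for example `http://localhost:8888/`) is an assumption about the local setup.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Request `url` once and return (status, elapsed_seconds).

    `status` is the HTTP status code if the server answered at all,
    or None if the request timed out or the connection failed.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # The server did respond, just with an error status (4xx/5xx).
        status = exc.code
    except (OSError, http.client.HTTPException):
        # Covers URLError, socket timeouts and refused/reset connections.
        status = None
    return status, time.monotonic() - start
```

Calling `probe(...)` every few seconds and printing the result should show `None` with an elapsed time close to `timeout` while the workers are hung, and a normal status code again once gunicorn has restarted them.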
I observed this on Hue 4.11 with a default configuration. The only change I made was configuring it to use PAM authentication, and I reproduced the problem with both MySQL and PostgreSQL databases. I've also just reproduced the same problem on a Hue build from the master branch at commit bdeccbd.
I also changed the gunicorn_worker_timeout value to 120 seconds to be able to reproduce this problem more quickly.

I will continue to investigate this, but it would be helpful to have some clues or directions if possible.
One small note to add: on Hue 4.11 this problem appears only on the Python 3 build. On the Python 2 build everything is fine.
Steps To Reproduce
Set gunicorn_worker_timeout to a smaller value, such as 120 seconds, to be able to reproduce this faster.

    from selenium import webdriver
    from selenium.common.exceptions import WebDriverException
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    import sys
    from threading import Thread
    import time
    from webdriver_manager.chrome import ChromeDriverManager

    # Number of parallel browser sessions; override via the first CLI argument.
    parallel = 10
    if len(sys.argv) > 1:
        parallel = int(sys.argv[1])


    def main():
        def wait_for_page_loaded(driver):
            # Block until jQuery animations have finished (up to 120 seconds).
            waiter = WebDriverWait(driver, 120)
            waiter.until(lambda driver: driver.execute_script('return jQuery(":animated").length == 0;'))


    if __name__ == '__main__':
        threads = []
        for _ in range(parallel):
            t = Thread(target=main, daemon=True)
            t.start()
            threads.append(t)
        # Wait for every session; daemon threads are killed when the main thread exits.
        for t in threads:
            t.join()
[16/Jul/2024 10:35:48 -0700] glogging CRITICAL WORKER TIMEOUT (pid:1672386)
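To see how often workers are being killed, lines like the one above can be pulled out of the log with a short script. This is a minimal sketch that assumes every timeout entry has the same layout as the single line shown here; the names `TIMEOUT_RE` and `parse_worker_timeouts` are hypothetical.

```python
import re
from datetime import datetime

# Matches e.g.:
# [16/Jul/2024 10:35:48 -0700] glogging CRITICAL WORKER TIMEOUT (pid:1672386)
TIMEOUT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\S+\s+CRITICAL\s+WORKER TIMEOUT \(pid:(?P<pid>\d+)\)"
)


def parse_worker_timeouts(lines):
    """Return a list of (timestamp, pid) tuples for WORKER TIMEOUT log lines."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match is None:
            continue  # not a worker-timeout entry
        # Timestamp layout assumed from the sample line: day/Mon/year time offset.
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y %H:%M:%S %z")
        events.append((ts, int(match.group("pid"))))
    return events
```

Feeding it an open log file (for example `parse_worker_timeouts(open(path))`, where `path` is wherever the Hue server log lives on the machine) gives the timestamps and worker pids, which can then be lined up against the moments the test users logged in.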