Open skye0402 opened 3 weeks ago
Any chance to look into it? I found that the likelihood of 404 errors increases as the pod ages (e.g. once it is more than 1 day old). It was definitely not happening with older Gradio versions.
The issue persists for me too. I am running a Gradio app on multiple AWS EKS pods and the 404 error shows up frequently, though not all the time. I had to downgrade to gradio==3.50.2.
Please look into it.
@w8jie It works without errors with e.g. Gradio 4.1x - at some point the bug was introduced.
Apologies for the late response. We'll need a methodical repro in order for us to investigate this issue. Would either of you be able to provide one?
@abidlabs - I understand that. The thing is, the error doesn't happen all the time. A session works fine for a while, then the error occurs and leads to a 404. If I open an incognito window I can work again, because that's a new session, but the session in the regular browsing window is lost. Istio uses the session ID from the browser to direct the request to the pod running the Gradio app that owns this session ID, but then the error log above appears. So far I haven't been able to provoke it; I think it becomes more likely the "older" the instance running Gradio gets. When it happens I have 2 options: wait until the session expires, or restart the pod manually.
It's become a real problem - I'd say it happens in 10 to 20% of the cases where a user wants to continue working. It's always the error above, and it seems the session ID is forgotten by Gradio (maybe after starlette raised the ASGI error?)
I can offer access to the instance for one of your developers if that's of any help and of course access to the source code.
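To make the suspected failure mode concrete, here is a toy model of it. This is only a sketch of the hypothesis described above, not Gradio's, starlette's, or Istio's actual code: `Pod`, `StickyRouter`, and the session IDs are made-up names, and it simply assumes that a pod keeps session state in memory and can lose a session while the browser still holds the cookie.

```python
# Toy model only: every class and name here is hypothetical.
from typing import Dict, Tuple


class Pod:
    """Hypothetical app instance that keeps session state in memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, dict] = {}

    def open_session(self, session_id: str) -> None:
        self.sessions[session_id] = {"history": []}

    def handle(self, session_id: str) -> int:
        # HTTP-like status: 200 if the session is known, 404 otherwise.
        return 200 if session_id in self.sessions else 404


class StickyRouter:
    """Hypothetical router that pins each session ID to one pod."""

    def __init__(self) -> None:
        self.affinity: Dict[str, Pod] = {}

    def pin(self, session_id: str, pod: Pod) -> None:
        self.affinity[session_id] = pod
        pod.open_session(session_id)

    def route(self, session_id: str) -> Tuple[str, int]:
        pod = self.affinity[session_id]
        return pod.name, pod.handle(session_id)


router = StickyRouter()
pod_a = Pod("pod-a")
router.pin("abc123", pod_a)
first = router.route("abc123")  # the pod still knows the session

# Assumption: the pod forgets the session while the browser keeps its cookie.
pod_a.sessions.pop("abc123")
second = router.route("abc123")  # same pod, but the session is gone

# A fresh session ID (like an incognito window) works on the same pod.
router.pin("new456", pod_a)
third = router.route("new456")

print(first, second, third)
```

In this model the sticky routing keeps working the whole time; the 404 comes purely from the pod no longer recognising the session ID, which would match the observation that a new session or a pod restart clears the problem.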
@abidlabs I took my chances and downgraded starlette to 0.37.2 (which dates back to March this year) to see if this fixes the problem. The next starlette release was from July, which could be when the problems started. Will update if that helps.
@abidlabs - Downgrading starlette didn't fix the error, at least not going back as far as 0.37.2. I don't know which starlette version Gradio shipped with in May/June, when the error didn't occur. If it was an older version, I could try downgrading further.
Describe the bug
For some time now I have been getting the session error below. I can't say exactly which release it started with: I'm currently on 4.41.0, and I know it wasn't happening with 4.22 (and maybe some later versions). The Gradio app is running on Kubernetes behind an approuter. The error isn't reproducible for me, but I saw other issues with the same error: #9070 and, more fitting, #6920. I already use sticky sessions, have maybe 20 concurrent users at peak and 4-5 instances of the app running. It happens in maybe 5% of cases (it's hard to measure). But it didn't happen on older Gradio versions (and I never upgraded the approuter). So I wonder if there's any way we can fix it? This is a nasty error, because a user can only work around it by opening an incognito window or clearing the cookie.
Have you searched existing issues?
Reproduction
Screenshot
No response
Logs
System Info
Severity
I can work around it