Closed: deanputney closed this issue 7 years ago.
Yes - the stuff from the logs is really weird on this and I've not yet figured out how to pin it down. That's why I've been periodically grasping at potential fixes (honestly, I'm still baffled that a couple of them didn't have any effect on the issue). Next time I catch it in action I will grab a snapshot of the relevant logs.
This is the brief summary I wrote for myself on the Trello card for the problem:
I see what's happening but still don't understand why. uWSGI swears it's got a 200 response; nginx returns a 502; the frontend flails wildly, advancing through the queue while simultaneously adding jslog calls to the clogged uWSGI workers.
It's kind of amazing how completely it falls over when this happens. The whole thing is just murdered.
What are some of the suspect errors in the nginx log?
It's especially amazing because I have made 3+ different changes that I thought would at least limit the flailing, and it hasn't gotten even a little bit better.
So far I seem to be able to reproduce in this manner:
The nginx logs seem wildly unhelpful tbqh. I'm going to change the log level and hope that makes it more legible. Anyway, here's a screenful of `ack 502 watershed.error.log`:
```
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http special response: 502, "/item/?"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http2 output header: ":status: 502"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http script var: "50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 uwsgi param: "REMOTE_PORT: 50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 finalize http upstream request: 502
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http finalize request: 502, "/item/?" a:1, c:1
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http special response: 502, "/item/?"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http2 output header: ":status: 502"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http script var: "50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 uwsgi param: "REMOTE_PORT: 50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 finalize http upstream request: 502
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http finalize request: 502, "/item/?" a:1, c:1
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http special response: 502, "/item/?"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http2 output header: ":status: 502"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http script var: "50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 uwsgi param: "REMOTE_PORT: 50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 finalize http upstream request: 502
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http finalize request: 502, "/item/?" a:1, c:1
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http special response: 502, "/item/?"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http2 output header: ":status: 502"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http script var: "50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 uwsgi param: "REMOTE_PORT: 50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 finalize http upstream request: 502
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http finalize request: 502, "/item/?" a:1, c:1
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http special response: 502, "/item/?"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http2 output header: ":status: 502"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http script var: "50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 uwsgi param: "REMOTE_PORT: 50248"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 finalize http upstream request: 502
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http finalize request: 502, "/item/?" a:1, c:1
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http special response: 502, "/item/?"
2017/03/16 20:05:11 [debug] 3001#3001: *21672 http2 output header: ":status: 502"
```
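(For context on the log-level change mentioned above: nginx's verbosity is controlled by the `error_log` directive, and the entries here are at `debug`, the noisiest level. A minimal sketch of dialing it back, with the log path guessed from the filename above:)

```nginx
# Sketch only; adjust the path to wherever watershed.error.log actually lives.
# Levels from most to least verbose: debug, info, notice, warn, error, crit, alert, emerg.
error_log /var/log/nginx/watershed.error.log warn;
```

Dropping below `debug` should strip the repeated per-request internals and leave just the actual upstream errors.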
Wait, you're getting it every time the queue advances now?
Mmmmmm, almost? I can try it again.
Yes, it is reproducible by following the instructions I posted above. This may not occur for a skip, but when an item (at least a Bandcamp one) naturally meets its end, the queue is cleared.
This is a flippin' breakthrough!! I'm annoyed that I never noticed this, but I'm really glad you did. Expect an update this afternoon!
So... it seems to pretty reliably, but not quite always, trigger on Bandcamp. I've also made more headway from the now-less-insane nginx logs, and I've got a new theory on how to better approach it from the frontend as well. I'm thinking it's all related. Short story: the app is easily clogged due to some poor settings in the nginx/uwsgi layers, and some bad logic in the frontend exacerbates the problem. I'm hoping the nginx/uwsgi settings will provide some relief, because the frontend changes will actually be a pretty involved fix, though still one worth pursuing eventually. Anyway, I'm going to attempt to tune those settings tonight.
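(The thread doesn't say exactly which settings were changed, so the following is only a sketch of the usual nginx-side knobs involved when all uWSGI workers are busy and nginx gives up with a 502; the socket path and values are guesses, not the actual fix:)

```nginx
# Hypothetical tuning sketch for the location that proxies to uWSGI.
location / {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/watershed.sock;

    uwsgi_connect_timeout 10s;  # how long to wait for a worker to accept the request
    uwsgi_read_timeout    60s;  # how long to wait for the app to produce a response
    uwsgi_send_timeout    60s;
    uwsgi_next_upstream   off;  # don't keep retrying the same clogged backend
}
```

On the uWSGI side, the matching knobs are typically the `processes`/`threads` counts and the `listen` backlog in the ini file, though again, which of these was actually touched isn't stated here.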
I'm not pleased with how long it took me, but I think I'm going to be able to push the fix tomorrow. I think I've got it solved locally but it's already way too late here 😩
Updates are out there -- look for extension version 0.3.1. Looks good from here but I'll wait and watch a bit before calling this closed.
Cool, will try this out today and let you know if it's improved.
I'm now finding that I can complete items without the queue being cleared or getting a 502 error, but the auto-advance is not functioning. We can either create a new issue for that or track it here, whatever works for you.
Edit: We should probably close this and create a new issue. I'll leave that up to you though.
Coulda sworn that was working when I finished up the other night but it was very late and I do suffer from a mild case of chronic dingus. I'm making a separate issue.
(see https://github.com/nthall/watershed/issues/5 for the auto-advance issue)
First closed issue! :1st_place_medal:
memoriiiieeeeesss
When the playlist empties out after an album finishes, I sometimes see a 502 gateway error when I reload the page. You might want to review your logs to see if you can find where that error is occurring.