Open AgentScrubbles opened 2 weeks ago
I dont think that error is related to ui problems. What does /api/v3/site
return? And are there any error messages logged for lemmy-ui on the server or in browser console?
Okay the damndest thing, I had rolled back to 0.19.3, and to test what you were asking I just bumped back to 0.19.5 and everything worked perfectly, no issues at all.
So, only thing I can think of is that it was some first run thing. To test it I brought the entire stack down and back up, and everything again is working. I did see notes about inbox, but it seemed to be asking for the inbox from lemmy.ml? But those all passed with a 200 so I'm not sure what happened.
I'm not sure what caused it, maybe it was pictrs updating? Either way, I'm going to mark this as resolved. If anyone stumbles on this issue, I guess downgrade back to 0.19.3, let it fully start, and then upgrade back to 0.19.5? (Do a backup first of course)
I'm reopening because it did start happening again, and I think I have more context. With the logs that @Nutomic requested, I can see that it's repeating /InboxTimeout when there's an underlying cause (maybe, I don't know how the code works, maybe just a guess)
Full logs including lemmy-ui for a timeout, it did eventually resolve:
2024-06-21T16:07:07.255646Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=640a0fda-5996-458f-a8a8-f5eab6f40d30
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:07.975568Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=ffde6c38-f4ac-4603-b2f3-daacfa50f631
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:10.535633Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=84afb98d-3f8c-4d35-b83a-0c84b9b85ab6
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:11.030047Z WARN lemmy_server::root_span_builder: IncorrectLogin: IncorrectLogin
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=GET http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/api/v3/private_message/list otel.kind="server" request_id=d58c52d2-56a8-4ec1-9771-1ca25d076dc6
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:12.615836Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=0bd05f02-ce1e-4a9e-96e5-4261730a60a5
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:13.175589Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=ff62e555-6ff6-4fbf-880e-70f832f0952b
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:14.713999Z WARN lemmy_server::root_span_builder: CouldntFindPerson: CouldntFindPerson
lemmy-1 | 0: lemmy_apub::api::read_person::read_person
lemmy-1 | with data=Query(GetPersonDetails { person_id: None, username: Some("ShortBoweledClown@lemmy.one"), sort: Some(New), page: Some(1), limit: Some(20), community_id: None, saved_only: Some(false) }) local_user_view=None
lemmy-1 | at crates/apub/src/api/read_person.rs:18
lemmy-1 | 1: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=GET http.scheme="http" http.host=lemmy:8536 http.target=/api/v3/user otel.kind="server" request_id=919d32ff-a531-439b-a366-8ea29e1a206c
lemmy-1 | at src/root_span_builder.rs:16
lemmy-1 | 2024-06-21T16:07:14.855723Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=2542edc0-fcb1-461a-9a92-3424aa194f44
lemmy-1 | at src/root_span_builder.rs:16
lemmy-ui-1 | Error: couldnt_find_person
lemmy-ui-1 | at LemmyHttp.<anonymous> (/app/node_modules/.pnpm/lemmy-js-client@0.19.4/node_modules/lemmy-js-client/dist/http.js:930:19)
lemmy-ui-1 | at Generator.next (<anonymous>)
lemmy-ui-1 | at fulfilled (/app/node_modules/.pnpm/lemmy-js-client@0.19.4/node_modules/lemmy-js-client/dist/http.js:5:58)
lemmy-ui-1 | at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
lemmy-1 | 2024-06-21T16:07:16.535695Z WARN lemmy_server::root_span_builder: InboxTimeout: InboxTimeout
lemmy-1 | 0: lemmy_server::root_span_builder::HTTP request
lemmy-1 | with http.method=POST http.scheme="https" http.host=poptalk.scrubbles.tech http.target=/inbox otel.kind="server" request_id=f1e0de6c-7c22-41a0-915a-05cd2a0d9dcf
lemmy-1 | at src/root_span_builder.rs:16
No errors in the browser log.
So the page does eventually load, after a very long wait (20-30 seconds sometimes), with those logs repeating
Those inboxtimeouts are almost certainly not related to the UI or API speed, so this is a different issue.
What UI pages are you getting slow responses from, all of them? Does api/v3/site
suffer from the same problem?
From what I can tell it's mostly the initial load of the page, API calls are slow, but it looks like once things get hydrated they back off and get faster. However the initial page loads usually still take a while, and while I can't say for sure, it did eventually crash after about 30 hours of usage before (where .3 did not). This is why I believe before it started working again, I probably waited longer for the site to be fully hydrated. However, intermittently it does happen again and the page loads jump back up to 1+ minutes for a response.
I think I agree, that the InboxTimeout may be a Red Herring, after reviewing the lemmy-ui logs I see many Request Timeouts, and I think that's what I'm seeing in my browser when I load, is that it's waiting on those to resolve and they're not. Something about sockets in there?
I'm attaching everything I can here. github_logs.txt
@dessalines I'm sorry, I don't see any calls to api/v3/site, do you want me to just do a standard postman call to it by itself?
Ive noticed various times that the comments for a post are not loading, maybe thats related. Cant think of any reason though.
Cant think of any reason though.
Trust me, I know that this is a developer's worst nightmare, the intermittent slowdown issues. Not a lot to go on. I want to say it's roughly every 24 hours - but there's no way to know for sure. Then I try later and it's perfectly fine. I'm keeping the devtools open today just to see if there's anything that stands out. If there's any other logging or metrics or anything you folks can think of I'll happily send it along.
More graphs, how sometimes the three update calls are almost instantaneous, but a comment can take 11+ seconds
And now running htop, I'm seeing massive CPU spikes from postgres (which I assumed anyway) and Lemmy (calling it), but also massive, massive spikes from node dist/js/server.js
. Is this lemmy-ui's host? Since lemmy is written in Go I'm surprised to see node in there, so I can only assume it's the host for lemmy-ui. Is there processing that that does? Maybe pictrs? Confirming with docker stats too, lemmy-ui is the resource hog right now.
And since node is single threaded it makes sense that if it gets a huge amount of stuff to do that it wouldn't be able to return the page in time, and cause timeouts. (I'm purely guessing on this one, just from my own experience there)
docker stats:
Requirements
Summary
After upgrading to 0.19.4 (or 0.19.5) from 0.19.3, I cannot access the UI, it times out. Checking Lemmy's logs, the following message repeats infinitely:
Steps to Reproduce
Unfortunately the only steps I have are from the upgrade itself, bumping the versions of the docker images. I'll add the docker versions below.
The only changes for the upgrade was pictrs from 0.5 to 0.5.16 and both Lemmy and Lemmy-UI to 0.19.4 (and then 5)
Technical Details
Version
0.19.4
Lemmy Instance URL
poptalk.scrubbles.tech