Closed samwiseg0 closed 5 months ago
Thanks. I'm working on tracking this down but thus far have been unable to reproduce. Can you add the following line to ~/printer_data/systemd/moonraker.env
?
MOONRAKER_VERBOSE_LOGGING="y"
Then restart Moonraker, repro the issue, and upload the log. This will create large logs, but will hopefully help me track down where the blocking issue is occurring (and how its related to authorization, it really shouldn't be).
Update:
Even though I can't reproduce I think I have found the source of the problem. There is a potential blocking call to socket.getfqdn
in the authorization
module that was generally bypassed previously. Its a call that is very rarely reached and normally it returns immediately. I'll work on a fix. You can hold off on the verbose logging right now, as I am reasonably confident that this is the problem.
Update:
Even though I can't reproduce I think I have found the source of the problem. There is a potential blocking call to
socket.getfqdn
in theauthorization
module that was generally bypassed previously. Its a call that is very rarely reached and normally it returns immediately. I'll work on a fix. You can hold off on the verbose logging right now, as I am reasonably confident that this is the problem.
This makes sense. I discovered late last night that my first DNS server was unresponsive so all queries were timing out on the first server and then heading to the second DNS server.
This should be resolved in d1f97f2658a4e003da95c131c87ce4c59e86de8f. In addition to running getfqdn()
in a thread to prevent blocking, it will now only be called if a domian name is configured in trusted_clients
.
Awesome! I will simulate the DNS failure and test before and after this commit and report back.
Thanks again for looking into this so quickly!
As best as I could test. I am mostly confident this resolved the issue.
When doing my testing I was not able to have fluidd lose connection entirely but it would freeze for a few seconds loading gcode file images. After your change there was no delay. With or without the DNS server being down.
Great, thanks. The new changes should avoid DNS lookups entirely given your configuration, which is what we want. It is likely that DNS resolution was the cause of the delay prior, even when the server was up.
What happened
On the latest version it appears there is some timeout that is triggered that causes fluidd to lose communication to moonraker.
This happens when trying to load the list of gcode files. When opening a folder where the gocde files are, the preview image may load on a few gcode files and then the interface hangs and becomes unresponsive.
To test, I rolled back to commit, https://github.com/Arksine/moonraker/commit/6d8cb762ffca1bd151d98e0cec5b66b966965014, using the command
git reset --hard 6d8cb762ffca1bd151d98e0cec5b66b966965014
and everything works as expected.This is happening all 3 of my printers. All of them are on the latest commits.
Client
Fluidd
Browser
Chrome, Safari, Other or N/A
How to reproduce
Run the latest version. I am using authz but I am unsure if that is related to this issue. Navigate to jobs and open a folder containing gcode files and the interface will become unresponsive.
Additional information
moonraker-5.log
Seeing this repeatedly in the logs.
Edit: I did a bit more testing. Looks like this maybe Authz specific. Adding my machine as a trusted client and connecting directly to the pi without a reverse proxy. It appears to be working just fine.
So this maybe related to the regression fix that was implemented in https://github.com/Arksine/moonraker/commit/a23187b4afa632da8a90cb9eae83f0f6a5eec903.I believe I found the commit that causes this issue after walking back a few commits at a time. https://github.com/Arksine/moonraker/commit/b3b60757aa0d06eac4c3395ac04a8a8b66e76f60