Closed kdomanski closed 4 months ago
Can you attach a log reproducing? The line numbers in the stack trace don't match the current repo. The error is indeed occurring because self.git_remote is set to "?", but the state shouldn't be stale. The remote and branch are set in _find_current_branch(). The remote itself is set by querying the configuration based on that branch, or it's parsed from the response to git branch if the head is detached. If you switch to a branch that has no upstream remote set in git config then an exception will correctly be raised.
FWIW, the update manager isn't really intended to be used in this manner. Its primary purpose is to provide updates from a remote on a static branch. Manually modifying the repo could certainly confuse it.
Wow, thanks for the super-quick response!
Can you attach a log reproducing? The line numbers in the stack trace don't match the current repo.
Sorry, I've been working with a codebase riddled with extra debug log statements. Here's a full log from unchanged code: moonraker.log
Turns out, simply restarting is not sufficient to reproduce from clean state. I can get it to throw the exceptions when I force a refresh, but that alone doesn't save the broken state. Race condition maybe?
The error is indeed occurring because self.git_remote is set to "?", but the state shouldn't be stale. The remote and branch are set in _find_current_branch(). The remote itself is set by querying the configuration based on that branch, or it's parsed from the response to git branch if the head is detached. If you switch to a branch that has no upstream remote set in git config then an exception will correctly be raised.
Sure, I get that, but it seems to have worked more cleanly in the past, i.e. this condition (self.git_remote == '?') would only result in an error message.
Now, no information about the actual problem shows up in Mainsail or fluidd. Only the recovery button is highlighted and the info message remains stale:
FWIW, the update manager isn't really intended to be used in this manner. Its primary purpose is to provide updates from a remote on a static branch. Manually modifying the repo could certainly confuse it.
It didn't occur to me that I'm doing anything unusual. I just had it running while working on some Klipper code in a branch. I know you only have so much time to work on this, so I'd be happy to fix this regression, unless you say "nah, I'm familiar with the codebase, I already know how to fix it".
It's also worrying that the broken status was somehow saved and prevented a refresh even after switching back to master. Even if you decide that update_manager must not be used with development repos, then there's a potential problem somewhere in there.
Cheers KD
BTW I still have one more corrupted repo state in storage, so I backed up the database in case it would help with debugging.
I don't think this is a bug or a regression. Switching to a new branch that has an untracked remote would have still raised an exception prior to https://github.com/Arksine/moonraker/commit/a7b9e5783de921c7b071ceec726eb403915ab34c. Simply put, the update manager can't fetch the updates from a remote when there is no remote set in the git config. The way to correct the issue is to switch to a branch that is tracked, or push the branch to a remote.
FWIW, I just verified. I was able to switch to a new branch on a configured repo and reproduce the error. After switching back to master a refresh correctly restores state.
The behavior from Moonraker is slightly different from https://github.com/Arksine/moonraker/commit/a7b9e5783de921c7b071ceec726eb403915ab34c, but both would have raised exceptions. In the current version the exception is raised when Moonraker attempts to verify the repo. The key branch.<branch_name>.remote does not exist in git config. As a result, Moonraker sets the remote to "?", as it's unknown. When Moonraker validates the remote before performing a fetch it raises an exception.
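The lookup described above can be sketched roughly as follows. This is not Moonraker's code: the names find_remote, validate_remote and run_git are hypothetical, and the runner is injectable only so the logic can be tried without a real repository. It relies on git config --get exiting with status 1 when the key is missing.

```python
import subprocess
from typing import Callable, Tuple

# (returncode, stdout) for a git invocation; injectable so the logic can be
# exercised without a real repository.
GitRunner = Callable[[list], Tuple[int, str]]


def run_git(args: list) -> Tuple[int, str]:
    """Run git in the current directory and return (returncode, stdout)."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def find_remote(branch: str, runner: GitRunner = run_git) -> str:
    """Return the upstream remote for `branch`, or "?" when none is configured."""
    code, out = runner(["config", "--get", f"branch.{branch}.remote"])
    # `git config --get` exits with 1 when the key does not exist.
    if code != 0 or not out:
        return "?"
    return out


def validate_remote(remote: str) -> None:
    """Refuse to fetch when the remote is unknown (hypothetical check)."""
    if remote == "?":
        raise RuntimeError("Cannot fetch: no remote is configured for this branch")
```

With this shape, a freshly created local branch that was never pushed yields "?", and the validation step is where the exception surfaces.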
In the linked commit the exception would have occurred here. In this version the call to git config would have returned status code 1, which would have raised a shell command error.
One other thing I forgot to mention: the update manager's status endpoint has a basic "spam" filter that simply returns the current state if a new request is received within 60 seconds of a fulfilled request. Generally speaking, refreshing the state isn't something that needs to occur often, and it tends to be CPU intensive (particularly on low resource machines). The filter exists to prevent several frontends/apps from connecting and requesting a status refresh all at once.
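For illustration, a filter of this kind could look something like the sketch below. The class name ThrottledStatus and its parameters are made up for the example, and it is only a guess at the general shape, not how Moonraker implements it.

```python
import time
from typing import Callable, Optional


class ThrottledStatus:
    """Serve a cached status when requests arrive too close together."""

    def __init__(self, refresh: Callable[[], dict], window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._refresh = refresh      # expensive call that rebuilds the status
        self._window = window        # seconds during which the cache is reused
        self._clock = clock          # injectable so the timing can be tested
        self._last_done: Optional[float] = None
        self._cached: dict = {}

    def get(self, want_refresh: bool = True) -> dict:
        now = self._clock()
        recently_served = (
            self._last_done is not None and now - self._last_done < self._window
        )
        if want_refresh and not recently_served:
            self._cached = self._refresh()
            # Stamp the time the refresh was fulfilled, not when it was requested.
            self._last_done = self._clock()
        return self._cached
```

Several frontends asking for a refresh within the same minute would then trigger a single refresh and share its result.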
That said, Moonraker also registers the POST /machine/update/refresh endpoint, which has no such filter. It can also be used to request a refresh for a particular item, i.e. POST /machine/update/refresh?name=klipper. I don't believe the primary front ends ever implemented it though.
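A small sketch of calling that endpoint from Python, using only the standard library. The base URL is an assumption (adjust host and port for your install), and build_refresh_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: Moonraker is reachable at this address; adjust for your setup.
MOONRAKER_URL = "http://localhost:7125"


def build_refresh_request(name: str, base_url: str = MOONRAKER_URL) -> Request:
    """Build a POST request asking Moonraker to refresh a single item."""
    query = urlencode({"name": name})
    return Request(f"{base_url}/machine/update/refresh?{query}", method="POST")


# To actually send it (requires a running Moonraker instance):
#   from urllib.request import urlopen
#   with urlopen(build_refresh_request("klipper")) as resp:
#       print(resp.status)
```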
Hey, when I wrote that it was introduced in a7b9e5783de921c7b071ceec726eb403915ab34c, I meant that the correct behavior is before it, meaning 35396a5b2a2d4b5781c9d89007f6278b61edcd8d.
But the behavior on startup in 35396a5b2a2d4b5781c9d89007f6278b61edcd8d appears to be the same. What stands out is that in 35396a5b2a2d4b5781c9d89007f6278b61edcd8d the function _update_repo_state() runs report_invalids(), which sets _is_valid to False and exits gracefully. Meanwhile, in a7b9e5783de921c7b071ceec726eb403915ab34c, _update_repo_state() doesn't do that.
Still, the corruption of the saved state is freaky, but I cannot reproduce it at this time. Has to be a race condition of some sort. 🤔
Hey, when I wrote that it was introduced in https://github.com/Arksine/moonraker/commit/a7b9e5783de921c7b071ceec726eb403915ab34c, I meant that the correct behavior is before it, meaning https://github.com/Arksine/moonraker/commit/35396a5b2a2d4b5781c9d89007f6278b61edcd8d.
Right, my mistake. As you stated, the behavior is the same.
What stands out is that in https://github.com/Arksine/moonraker/commit/35396a5b2a2d4b5781c9d89007f6278b61edcd8d the function _update_repo_state() runs report_invalids( ) which sets _is_valid to False and exits gracefully. Meanwhile https://github.com/Arksine/moonraker/commit/a7b9e5783de921c7b071ceec726eb403915ab34c _update_repo_state() doesn't do that.
That is true, however an exception in GitRepo.initialize() would still propagate up to refresh().
Still, the corruption of the saved state is freaky, but I cannot reproduce it at this time. Has to be a race condition of some sort.
Are you sure the saved state was corrupted? Mainsail will at times report corrupted when the repo is invalid.
All that said, I did take a deeper look at this, and while it appears to me that it's behaving as expected, the condition could be more clear to the user. In addition, the state should be saved on failure so it's not reported as valid on a restart. Commit 119f579a44ab569524cbae12852362fc8ece1e68 addresses both of those issues.
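The "save on failure" idea can be illustrated with a generic pattern like the one below. This is not what the commit does internally. The function refresh_and_persist and the is_valid key are hypothetical and only show how a failed refresh can still leave an accurate record behind.

```python
from typing import Callable


def refresh_and_persist(refresh: Callable[[], None], state: dict,
                        save: Callable[[dict], None]) -> None:
    """Run a refresh and persist the resulting validity either way."""
    try:
        refresh()
        state["is_valid"] = True
    except Exception:
        # Record the failure so a restart does not restore a "valid" state.
        state["is_valid"] = False
        raise
    finally:
        # Runs on both paths; the exception (if any) still propagates.
        save(dict(state))
```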
Are you sure the saved state was corrupted? Mainsail will at times report corrupted when the repo is invalid.
"Corrupted" was the wrong word for me to use. The stored state correctly reflected the (previous) invalid state of the repo.
But it could never be updated to the current (manually fixed with git switch master) state, because in refresh_repo_state() the call to _verify_repo() happens BEFORE the call to _find_current_branch().
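A toy model of that ordering problem, to make it concrete. ToyRepo and its methods are invented for this illustration and do not mirror the real GitRepo class. The point is only that verifying a restored value before re-detecting it can never recover.

```python
class ToyRepo:
    """Toy model of a repo whose remote is cached between refreshes."""

    def __init__(self, upstreams: dict, branch: str, cached_remote: str) -> None:
        self.upstreams = upstreams      # stand-in for git config: branch -> remote
        self.branch = branch            # branch currently checked out on disk
        self.remote = cached_remote     # value restored from saved state

    def find_current_branch(self) -> None:
        # Re-read the remote for the checked-out branch; "?" means unknown.
        self.remote = self.upstreams.get(self.branch, "?")

    def verify(self) -> None:
        if self.remote == "?":
            raise RuntimeError("unknown remote")

    def refresh_verify_first(self) -> None:
        self.verify()                   # fails on the stale "?" ...
        self.find_current_branch()      # ... so this line is never reached

    def refresh_detect_first(self) -> None:
        self.find_current_branch()      # replace the stale value first
        self.verify()
```

With a repo that is back on master but was restored with remote "?", refresh_verify_first() keeps raising, while refresh_detect_first() picks up the configured remote and passes.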
Anyways, I just pulled your new commit and this is a MASSIVE improvement in the user experience. Thanks a lot!
What happened
I noticed that for some repos updated with update_manager, I get the following stacktraces in logs:
On top of that, manually triggering the repo update in UI never seems to really refresh the repo state. I see warnings that are out-of-date.
After digging into the code, I determined the following:
In GitDeploy.initialize(), the repo state is restored from storage
GitDeploy._update_repo_state() wants to run GitRepo.refresh_repo_state() and then verify the result with repo.is_valid()
_update_repo_state() just throws an exception
master?
_check_warnings() for this exact condition: if self.git_branch != self.primary_branch:
refresh_repo_state would only run this waaay after the exception is thrown
_check_warnings is ran is that initial check based on stale state
repo.report_invalids()
Client
Fluidd
Browser
Firefox
How to reproduce
Additional information
No response