bontebok opened this issue 2 days ago
Thanks for reporting this!
I've pulled logs and restarted the sync queue worker.
It seems to be processing now, which should resolve the immediate issue, but I'll still go through the logs and figure out why it stopped in the first place so it doesn't happen again.
I can confirm that items are syncing now.
The headless and my client synced. Thank you @Frooxius - FYI, the CJ Build Battle is tonight, so there will be a lot of world syncing.
If it gets stuck again, poke me and I'll kick it again.
No sync issues during the jam, thanks for the quick response!
Thanks for the update!
I want to keep this open though, I still want to dig into the underlying cause of this.
Sure thing!
Related, if you ever want to talk about the preprocessing routines, I'm curious to learn what goes into the latency of the cloud's response. I know it's performing asset lookups to determine whether an asset already exists so it can inform the client, but unless it's also performing an R2 check, rehashing, etc., this should be a pretty quick operation at the database level. If you're ever diving into this and want another pair of eyes or a rubber duck, feel free to reach out.
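For context on why I'd expect that part to be fast, here's a rough sketch of the kind of check I have in mind - the SQLite backend, table and column names are purely illustrative stand-ins, not the actual cloud schema:

```python
# Minimal sketch: an indexed existence check against a local SQLite table,
# standing in for whatever the cloud's asset database actually is.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE assets (hash TEXT PRIMARY KEY, size INTEGER)")
db.execute("INSERT INTO assets VALUES ('abc123', 4096)")

def asset_exists(asset_hash: str) -> bool:
    # A primary-key/indexed lookup like this is cheap and typically returns
    # in well under a millisecond; any significant latency would have to
    # come from extra work (blob-store HEAD requests, rehashing, etc.).
    row = db.execute(
        "SELECT 1 FROM assets WHERE hash = ? LIMIT 1", (asset_hash,)
    ).fetchone()
    return row is not None

print(asset_exists("abc123"))   # True
print(asset_exists("missing"))  # False
```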
Thank you, I appreciate the offer! I'm not sure when I'll be digging into it some more, though it's something I'd like to make more efficient where possible.
To give more context, the preprocessing routines are a fair bit more complex than just checking the existence of assets.
The major part of the preprocessing is pre-pinning all the assets on both the per-account and global reference counting lists, to make sure the user is allowed to upload everything needed for syncing a particular record - that way the sync won't suddenly fail in the middle of uploading the assets.
It essentially "stages" all the changes the sync wants to make, so the actual upload can then happen "confidently", and once it's done it confirms all the staged changes that were made in preprocessing.
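In very rough terms the flow looks something like the sketch below - the names and the simple quota model are just illustrative Python, not the actual implementation:

```python
# Hypothetical stage -> upload -> confirm flow; class and method names are
# illustrative stand-ins, not Resonite's real API.

class QuotaExceeded(Exception):
    pass


class Account:
    def __init__(self, quota: int):
        self.quota = quota
        self.staged: set[str] = set()
        self.pinned: set[str] = set()

    def pin(self, asset: str, size: int) -> None:
        # Staging counts against the quota immediately, so the upload
        # can't run out of allowance halfway through.
        if size > self.quota:
            raise QuotaExceeded(asset)
        self.quota -= size
        self.staged.add(asset)

    def unpin(self, asset: str, size: int) -> None:
        self.quota += size
        self.staged.discard(asset)

    def confirm(self) -> None:
        self.pinned |= self.staged
        self.staged.clear()


def sync_record(assets: dict[str, int], account: Account) -> None:
    staged = []
    try:
        for asset, size in assets.items():
            account.pin(asset, size)        # stage ("pre-pin") everything first
            staged.append((asset, size))
    except QuotaExceeded:
        for asset, size in staged:          # roll back partial staging
            account.unpin(asset, size)
        raise
    for asset, _size in staged:
        print(f"uploading {asset}")         # actual upload happens "confidently" here
    account.confirm()                       # finalize the staged pins


acct = Account(quota=10_000)
sync_record({"mesh": 4_000, "texture": 3_000}, acct)
print(acct.pinned)  # {'mesh', 'texture'}
```

The point of the rollback branch is that a failed permission or quota check during preprocessing leaves no dangling pins behind.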
I don't think it's related to this issue though, since I haven't really changed anything in this part recently, and the whole process is wrapped in retry logic - so even if syncing one record fails, the worker should keep processing the queue and retry the failed one later. The fact that it just stopped completely and died is a bit odd.
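To illustrate what I mean by the retry logic, the shape of the loop is roughly this (again, just an illustration, not the real worker code):

```python
# Rough sketch of a retry-per-record queue worker: one failing record gets
# requeued instead of stopping the whole queue.
from collections import deque

def run_sync_queue(records, sync_one, max_attempts=3):
    queue = deque((record, 0) for record in records)
    while queue:
        record, attempts = queue.popleft()
        try:
            sync_one(record)
        except Exception as exc:
            if attempts + 1 < max_attempts:
                # Push the failed record to the back and keep going.
                queue.append((record, attempts + 1))
            else:
                print(f"giving up on {record!r}: {exc}")
    # The worker only stops when the queue is drained, so a complete stall
    # points at something outside the per-record retry path (a crashed
    # process, a deadlock, an exception escaping the loop, etc.).

run_sync_queue(["recordA", "recordB"], sync_one=lambda record: None)
```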
I've done some investigation and fix-ups based on what I could find, as well as adding some additional diagnostics, alerts and error wrapping.
I'll monitor the issue and see if it recurs. I'm not 100% sure whether the issues I found in the logs were the actual culprit here.
Describe the bug?
A headless we are running for a build session stopped syncing at 8:33 PM UTC (about an hour ago). Starting a vanilla client under a different user and attempting to sync also fails.
Attached are the logs from the vanilla client saving an empty gridspace world to a group.
To Reproduce
Save an empty gridspace world.
Expected behavior
The sync goes through in a short period of time.
Screenshots
Resonite Version Number
2024.11.19.479
What Platforms does this occur on?
Windows
What headset if any do you use?
Desktop and Headless
Log Files
DESKTOP-M4CKTHS - 2024.11.19.479 - 2024-11-23 16_15_31.log
Additional Context
No response
Reporters
@Rucio