Neos-Metaverse / NeosPublic

A public issue/wiki only repository for the NeosVR project

Clogged Sync is Caused Due to Incorrect Error Handling of Failed Sync Tasks #3915

Open stiefeljackal opened 1 year ago

stiefeljackal commented 1 year ago

Describe the bug?

If a failure or exception state occurs within a sync task, the sync process loop stops. No further syncing takes place, and sync tasks pile up until the client or headless is restarted.

Relevant issues

Reported issues that experienced System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed. (failure state):

Reported issues that experienced other exceptions that break the syncing process (exception state):

To Reproduce

Cause a sync issue by creating a deeply nested object with slots (#3313), causing a save conflict, or hoping for an HTTP 429 error code (or any HTTP error) to occur. If you are technically inclined, you can modify the global version of one of your LiteDB entries to cause the sync to clog. From there, try to save any item in your inventory and watch the sync queue go up.

Expected behavior

Error handling should be implemented so that a failed sync task does not clog the sync. When the sync has finished, it should show that there are Sync Errors.

Any future syncs during the same session should still occur.

Log Files

Vanilla Client (with current bug): JACKAL_MK2 - 2022.1.28.1310 - 2023-06-12 14_37_21.log

Client with Mod (fixes the bug): JACKAL_MK2 - 2022.1.28.1310 - 2023-06-12 14_39_18.log

Screenshots

Please review the video that demonstrates the clogged sync (and a demonstration of how it is supposed to work):

https://youtu.be/sKnmyzFoUWY

How often does it happen?

Always

Does the bug persist after restarting Neos?

Yes

Neos Version Number

2022.1.28.13XX

What Platforms does this occur on?

Windows, Linux

Link to Reproduction Item/World

No response

Did this work before?

Yes

If it worked before, on which build?

Before 2021.10.17.1326 (based on previous issues)

Additional context

Failure State

When a failure state is declared, the task is correctly marked as completed; however, the task is then marked as completed a second time, which is not allowed. This is why the sync logs are littered with An attempt was made to transition a task to a final state when it had already completed, and because this exception is thrown...
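The double completion described above can be illustrated outside of Neos. In .NET, this exception message is the one raised when a TaskCompletionSource is completed a second time; the sketch below uses Python's asyncio.Future as an analogue, which raises InvalidStateError in the same situation. It is not the Neos code, and the helper name complete_once is hypothetical.

```python
import asyncio


def complete_once(future, result):
    """Hypothetical helper: set the result only if the future is not already done."""
    if future.done():
        return False
    future.set_result(result)
    return True


async def main():
    events = []
    loop = asyncio.get_running_loop()

    # Unguarded: the failure path completes the future, then a second
    # completion attempt raises instead of being ignored.
    unguarded = loop.create_future()
    unguarded.set_result("failed")
    try:
        unguarded.set_result("done")
    except asyncio.InvalidStateError:
        events.append("second completion raised InvalidStateError")

    # Guarded: the second attempt becomes a no-op and the first result is kept.
    guarded = loop.create_future()
    events.append("first: %s" % complete_once(guarded, "failed"))
    events.append("second: %s" % complete_once(guarded, "done"))
    events.append("result: %s" % guarded.result())
    return events


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

Guarding the second completion keeps the original failure result while avoiding the exception that would otherwise propagate into the sync loop.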

Exception State

An exception thrown during the sync process causes the sync loop to stop. The most common exception is the one described above, but other exceptions, such as the ones posted in #3668 and #3313, have also broken the sync loop.
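A minimal sketch of the expected behavior, assuming a simple queue-driven loop: each task is wrapped in its own error handler, so a failing task is recorded as a sync error and the loop moves on to the next one. The names run_sync_loop, ok, conflict, and the record labels are hypothetical and do not reflect how Neos structures its sync code.

```python
import asyncio


async def run_sync_loop(queue, errors):
    """Drain the queue; a failing task is recorded instead of stopping the loop."""
    completed = 0
    while not queue.empty():
        name, task_fn = queue.get_nowait()
        try:
            await task_fn()
            completed += 1
        except Exception as exc:
            # Record the error and keep going so later tasks still sync.
            errors.append((name, repr(exc)))
        finally:
            queue.task_done()
    return completed


async def ok():
    await asyncio.sleep(0)


async def conflict():
    raise RuntimeError("save conflict")


async def demo():
    queue = asyncio.Queue()
    for item in [("record-1", ok), ("record-2", conflict), ("record-3", ok)]:
        queue.put_nowait(item)
    errors = []
    completed = await run_sync_loop(queue, errors)
    return completed, errors, queue.qsize()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this shape, the queue empties even though one task failed, and the collected errors can be surfaced to the user as Sync Errors once the sync has finished.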

Mod

The following is a mod that will fix the clogging issue:

https://github.com/stiefeljackal/JworkzNeosFixFrickenSync/releases/tag/v0.1.1

Reporters

U-StiefelJackal (Discord: Stiefel Jackal #stiefeljackal)