bahrmichael / factorio-tycoon

GNU General Public License v3.0
9 stars 7 forks source link

Multiplayer desynchronization after last update #245

Open LUISDASARTIMANHAS opened 8 months ago

LUISDASARTIMANHAS commented 8 months ago

Multiplayer desynchronization after last update After that bug I updated the mod on the server and continued with the save. After a few minutes, players began to disconnect due to desynchronization and infinite map variant saves. The ways that were tested were: 1-use only the tycoon mod (showed errors) 2-use the tycoon mod with other mods (showed errors) 3-use other mods without the tycoon mod (no errors) 4-use no mod just vanilla (no errors)

See the attached images below with the map variants saved by the game and the server

image image image image

GOOGLE DRIVE

LUISDASARTIMANHAS commented 8 months ago

I believe that the fact of using the same city after the update should have an influence, but the topic will be open for thorough investigation

bahrmichael commented 6 months ago

Thank you for the detailed report! I have not had time to dive deeper into those problems, and during a recent multiplayer event we didn't encounter any issues. I'll keep this issue open for reference when we encounter more multiplayer desync issues.

winex commented 6 months ago

@LUISDASARTIMANHAS hi! can you please confirm if this is still an issue with latest version v0.4.5 (or incoming v0.4.6)?

i didn't test multiplayer yet, but will dig deep into any issues to fix it!

bahrmichael commented 6 months ago

We had another report on the forums, but I have no idea how to investigate that yet: https://mods.factorio.com/mod/tycoon/discussion/65ecdf2b3bf3f8cc70917770

winex commented 5 months ago

@bahrmichael hmm, i couldn't reproduce any desyncs, yet. maybe it needs 2nd player to trigger that

+? future test case: playing multiplayer with 3 players on different OSes for 10-20 minutes

my thoughts

don't know how to debug a Factorio desync correctly, so i can't find what line/variable caused it. here are some points:

my tests:

i've joined a LAN dedicated server (linux-linux) with tycoon_0.4.5 running these saves:

all constructions were built fine, so i suppose even 0.4.6/main version will work the same.

notes:

  1. those saves might have some other mods installed before. while it might not affect, i see some (ex: LTN-related stuff) in binary files like level-init.dat
  2. desync save has apple-farm(initial) position.y=0.0, while starting from v0.3.1x+ it always going to be 0.5 (last argument of find_non_colliding_position_in_box() is true). such saves are so old now, but i'll keep an eye on that, too
winex commented 5 months ago

ok, using v0.4.5, i've started 2 Factorios and joined dedicated LAN server (on another host as before). again, for few (2-3) hours i couldn't reproduce this, construction went fine all the time.

but then i ran by the client-2 and clicked on town hall, then couple of tabs there to see stats... and finally got desync! and it can be reproduced easily just by reconnecting and some clicking there again and again!

logs

latest_input_actions.dat has some of these lines in desynced-level.zip, need to compare it to reference and against client-1 save. this is 2nd desync happened on tick 1490401:

--- cut ---
<action type=GuiClick update-tick=1490160>\....<data>.......i..........</data></action>
<action type=GuiSelectedTabChanged update-tick=1490222>h....<data>......................</data></action>
<action type=GuiClick update-tick=1490222>\....<data>..................</data></action>
<action type=StopBuildingByMoving update-tick=1490404>5....<data>.</data></action>

3rd desync happened on tick 1493341:

<action type=CloseGui update-tick=1493038>.....<data>.</data></action>
<action type=StopBuildingByMoving update-tick=1493039>5/...<data>.</data></action>
<action type=SelectedEntityChangedRelative update-tick=1493049>.9...<data>.....</data></action>
<action type=SelectedEntityCleared update-tick=1493052>.<...<data>.</data></action>
<action type=OpenCharacterGui update-tick=1493107>.s...<data>.</data></action>
<action type=CloseGui update-tick=1493135>.....<data>.</data></action>
<action type=StopBuildingByMoving update-tick=1493162>5....<data>.</data></action>
<action type=SelectedEntityChangedVeryClose update-tick=1493165>ȭ...<data>..</data></action>
<action type=OpenGui update-tick=1493171>.....<data>.</data></action>
<action type=SelectedEntityCleared update-tick=1493172>.....<data>.</data></action>
<action type=GuiSelectedTabChanged update-tick=1493256>h....<data>.m....................</data></action>
<action type=GuiClick update-tick=1493256>\....<data>.p.....z..........</data></action>
--- cut, similar +3 times ---
<action type=GuiSelectedTabChanged update-tick=1493335>hW...<data>.m....................</data></action>
<action type=GuiClick update-tick=1493335>\W...<data>.p................</data></action>

i have no idea know what all of this means, yet.

it looks like there is something bad happening after gui click also (might be another issue, though), but it's not always because of it and needs some ticks to pass... note: actions like SelectedEntityChangedVeryClose in multiplayer might differ for each client (unless it's local, idk)

winex commented 5 months ago

PR #288

winex commented 5 months ago

okay, found another issue, which could be additional or the root cause. needs testing w/o gui event fixes

just compared level_with_tags_tick_*.dat:

-<entity-data name=tycoon-excavation-pit-2>
+<entity-data name=tycoon-excavation-pit-15>

as per Gangsir#Use_local_variables_that_are_not_final_outside_of_event_hooks, this is sooo bad: https://github.com/bahrmichael/factorio-tycoon/blob/fbbe87d273a89e188cd8255b870af680d61f168e/city.lua#L633-L671 so, if any function from this block is called - it will modify these locals for any instance of this require. idk how Lua handles multiple imports, but in mp these will definitely not be synced, unless moved into global table! maybe 2 require()s named differently can confirm such cases in a single client...

also, it's bad to hard-code total number of prototypes here -- it should be rebuilt in some way or taken from data-prototypes, like we did with resources (probable desync there, too :P)

to fix this i propose using simply city.generator(totalNumberOfType) instead of iterating...

winex commented 5 months ago

just compared level_with_tags_tick_*.dat:

-<entity-data name=tycoon-excavation-pit-2>
+<entity-data name=tycoon-excavation-pit-15>
  • [x] CONFIRMED: all my 6 desyncs was caused not by gui clicks as i thought before, but by desynced numbers in: pits, gardens, houses
bahrmichael commented 5 months ago

okay, found another issue, which could be additional or the root cause. needs testing w/o gui event fixes

just compared level_with_tags_tick_*.dat:

-<entity-data name=tycoon-excavation-pit-2>
+<entity-data name=tycoon-excavation-pit-15>

as per Gangsir#Use_local_variables_that_are_not_final_outside_of_event_hooks, this is sooo bad:

https://github.com/bahrmichael/factorio-tycoon/blob/fbbe87d273a89e188cd8255b870af680d61f168e/city.lua#L633-L671

so, if any function from this block is called - it will modify these locals for any instance of this require. idk how Lua handles multiple imports, but in mp these will definitely not be synced, unless moved into global table! maybe 2 require()s named differently can confirm such cases in a single client... also, it's bad to hard-code total number of prototypes here -- it should be rebuilt in some way or taken from data-prototypes, like we did with resources (probable desync there, too :P)

to fix this i propose using simply city.generator(totalNumberOfType) instead of iterating...

That code is from when I didn't know about entity variants yet. Maybe we can use them instead to have randomized pictures for houses, instead of this "randomization" code.