GTNewHorizons / GT-New-Horizons-Modpack

New Modpack with Gregtech, Thaumcraft and Witchery
https://www.gtnewhorizons.com/
Other
980 stars 299 forks source link

AE2 interfaces getting stuck #9755

Closed GTNHAfx237v7 closed 1 year ago

GTNHAfx237v7 commented 2 years ago

Your GTNH Discord Username

Afx237v7

Your Pack Version

2.1.2.2

Your Server

SP

Type of Server

Single Player

Your Expectation

Updated AE2 to appliedenergistics2-1.7.10-rv3-beta-76-GTNH-1-g6ee0f9a.jar and holoinventory to holoinventory-1.7.10-2.1.5-beta-7-g514918d.jar from jenkins. Did not make any changes to the ME system prior to updating. Expected everything to work as normally.

The Reality

Interfaces keep getting stuck, i.e. will not accept items anymore (from output buses or pipes inserting into them) or provide items in their internal inventory (which are then extracted via conveyor). They will again function correctly if you click one of the top row slots, but only for a seemingly random amount of time until they are stuck again. AE system has sufficient power, storage space, everything is properly chunkloaded.

Your Proposal

As this update contains changes to the pathfinding, maybe this is related? There are no entries in fml-client-latest.log or latest.log that hint at errors (no CME, NPE etc.)...

Final Checklist

repo-alt commented 2 years ago

Pathfinding code is run when the network is rebuilt, and the result of it is "online"/"offline" status and channel distribution. This is a weird bug and I currently have no idea what might have caused it

GTNHAfx237v7 commented 2 years ago

Any suggestions for me to try profile / narrow it down ingame? We also discussed this briefly on discord, where I mentioned the NPE error within holoinventory, but this has not happened after updating the version.

bombcar commented 2 years ago

Didn’t we just add code to speed up “first look” in AE? Could that be related?

there are slightly later versions but they’re going to be near identical code. If you want to try reverting ae2 (maybe just disable holo entirely) until it doesn’t happen that might help narrow it down

GTNHAfx237v7 commented 2 years ago

It also seems to take some time after world launch to appear, have not noticed it immediately after joining the world. It definitely did not happen with appliedenergistics2-1.7.10-rv3-beta-71-GTNH-1-gffe586d

bombcar, do you have a build that predates the pathfinding commit? Did not find anything suitable on jenkins... https://github.com/GTNewHorizons/Applied-Energistics-2-Unofficial/commit/b876bbab78fdd43922d4072c89677907628ee393

GTNHAfx237v7 commented 2 years ago

the a.m. appliedenergistics2-1.7.10-rv3-beta-71-GTNH-1-gffe586d dated 20.01.2022, before the implementation of ultra dense (backbone) cables.

bombcar commented 2 years ago

Here's all the tagged versions - if you need something more precise I can check out a commit and build

repo-alt commented 2 years ago

beta-71 is essentially previous version

GTNHAfx237v7 commented 2 years ago

@bombcar image :(

bombcar commented 2 years ago

Some proxy you have doesn't like strange ports :(

GTNHAfx237v7 commented 2 years ago

Well looking at the commit history and as repo said, beta71 is the latest version before implementation of the ultra dense/backbone cables. I did not mention it before but there are only covered and dense cables in my world, no ultra dense/backbone cables.

bombcar commented 2 years ago

Oh you can also find the jars here: https://github.com/GTNewHorizons/Applied-Energistics-2-Unofficial/releases ( click "Assets" to expand - maybe try -73 as that has lots of stuff)

repo-alt commented 2 years ago

I'll add one additional state reporting to Waila, to display (seemingly impossible) state of "booting" when channels and power are already ok but node is not active.

GTNHAfx237v7 commented 2 years ago

I will also try with beta 72 and beta 73 to see if I can narrow it down - the additional WAILA state sounds like a suitable debugger.

repo-alt commented 2 years ago

Here is the somewhat extended Waila support: http://jenkins.usrv.eu:8080/job/Applied-Energistics/lastBuild/ If those stuck interfaces will show up as "booting" then it will be more clear.

GTNHAfx237v7 commented 2 years ago

So I tried with appliedenergistics2-1.7.10-rv3-beta-77-GTNH-1-gc1570f3.jar and it keeps happening. WAILA does not report "Booting" as state for any of the interfaces that are stuck. They are either ONLINE or, after I broke and replaced them, MISSING CHANNEL until they switch to ONLINE. What I can confirm is that the longer the game runs, the worse it gets. Today, the first ~ 60mins were fine until the first interface failed. After about two hours I had about a dozen interfaces stuck, which needed to be "reactivated" by clicking the GUI or braking and replacing them.

repo-alt commented 2 years ago

I think I'll add some debug diagnostics log, printing AE ticking requests. I suspect they go to sleep, but not sure about that (and can't think of any reason too)

GTNH-Afx237v7 commented 2 years ago

Would it be possible to indicate the interface x/y/z position as well?

GTNHAfx237v7 commented 2 years ago

So I ran with the latest AE2 incl debugging and eventually found one interface that shut off: <html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40">

[15:33:18] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 22800 | ns, | new | state | is | SLEEP -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [15:33:34] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 9651000 | ns, | new | state | is | SLEEP [15:33:35] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 2108000 | ns, | new | state | is | SLEEP [15:33:36] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1624300 | ns, | new | state | is | SLEEP [15:33:36] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1531400 | ns, | new | state | is | SLEEP [15:33:37] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1747500 | ns, | new | state | is | SLEEP [15:33:37] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1739800 | ns, | new | state | is | SLEEP [15:33:37] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1282500 | ns, | new | state | is | SLEEP [15:33:38] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1068700 | ns, | new | state | is | SLEEP [15:33:38] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 759300 | ns, | new | state | is | SLEEP [15:33:38] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1173400 | ns, | new | state | is | SLEEP [15:33:39] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1110200 | ns, | new | state | is | SLEEP [15:33:39] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1047200 | ns, | new | state | is | SLEEP [15:33:39] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 651600 | ns, | new | state | is | SLEEP [15:33:40] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1214200 | ns, | new | state | is | SLEEP [15:33:40] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1317700 | ns, | new | state | is | SLEEP [15:33:41] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 9298100 | ns, | new | state | is | SLEEP [15:33:41] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1388800 | ns, | new | state | is | SLEEP [15:33:41] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 998600 | ns, | new | state | is | SLEEP [15:33:41] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 764900 | ns, | new | state | is | SLEEP [15:33:42] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 850000 | ns, | new | state | is | SLEEP [15:33:42] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 7200 | ns, | new | state | is | SLEEP [16:15:35] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 1898300 | ns, | new | state | is | SLEEP [16:15:36] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 398500 | ns, | new | state | is | SLEEP [16:15:36] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 272900 | ns, | new | state | is | SLEEP [16:15:37] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 413200 | ns, | new | state | is | SLEEP [16:15:37] | [Server | thread/DEBUG] | [AE2:S/]: | Timing: | machine | tick | at | (1573 | 162 | 438) | took | 219600 | ns, | new | state | is | SLEEP

As you can see the timing is quite high, do interfaces (or AE participants in general) get shut off once tick time is too long? I joined at 15:33 and found this interface "stuck" at 16:15. According to the list it was not "online" until I manually clicked in it at 16:15.

repo-alt commented 2 years ago

No one counts the tick time. Devices do "shut off" if their new state is "SLEEP" like in our case. If an interface goes to sleep, it is woke up when it's inventory changes. When inventory changes interface looks if there are pending requests, then updates work plan and checks if there are pending requests now. If the state is the same, then it does not wake the interface. Theoretically interface should go to sleep only if it has nothing to do, but there is another possible corner case. I'll add some more diagnostics in debug mode

repo-alt commented 2 years ago

(build should appear on jenkins soon)

GTNHAfx237v7 commented 2 years ago

This particular interface handles the drops from my multifarm, once I looked at it I saw that all it’s inventory slots were full. I have 70MB of log, will need to check out more details as well.

GTNH-Afx237v7 commented 2 years ago

So after updating to 2.1.2.3qf this seems to no longer happen. Will keep an eye on it but yesterday everything went fine for several hours.

kilianloftis commented 2 years ago

I've been having the same thing happen, but not just on interfaces; import/export busses, storage busses, and ME drives. It did seems better when I went to 2.1.2.3qf, but has definitely gotten worse lately.

repo-alt commented 2 years ago

It is not the same issue, there is some bug in the last version triggering frequent network rebuilds. I noticed it on quantum bridges on Zeta, but people report it affects other components too. Naturally I can't reproduce anything like that in development environment, but I added some debug logging in the last build, so I hope I'll be able to track it down eventually.

kilianloftis commented 2 years ago

Well, I'm glad I'm not the only one I guess. Is there I Github issue open for it? I couldn't find it.

Is there some way I can get you more information? Or some other way I can help?

repo-alt commented 2 years ago

cf https://github.com/GTNewHorizons/GT-New-Horizons-Modpack/issues/10029