MikaylaFischler / cc-mek-scada

Configurable ComputerCraft SCADA system for multi-reactor control of Mekanism fission reactors with a GUI, automatic safety features, waste processing control, and more! Please be sure to take a look at the Wiki tab, this project has lots of docs!
https://www.youtube.com/@cc-mek-scada
MIT License
343 stars 15 forks source link

RTU (infinitely) Increasing RTT #291

Closed awesomedan153 closed 1 year ago

awesomedan153 commented 1 year ago

For some reason, when I connect my RTU to my Supervisor the RTT steadily climbs no matter what (the highest I've seen it get to is 80000ms). The RTU is connected to the network per the RTU computer and there is a heartbeat. I'm not sure what is causing this issue or how to solve it.

MikaylaFischler commented 1 year ago

That's the second report of this I've had, thanks for letting me know. The RTT is calculated each time a basic ping message is sent, so I'm starting to wonder if other traffic is filling it up on some servers. Can you run pastebin put log.txt on the RTU and on the supervisor and tell me the two links it gives you so I can look at the logs? Can you also let me know how many devices you have on the RTU and what types they are?

MikaylaFischler commented 1 year ago

I see you closed this but since this is the second time I've heard of this I think something is actually wrong, I have a concern that there is a lot of RTU traffic to access the peripherals and it's making the ping packets sit around too long.

awesomedan153 commented 1 year ago

Here is the log file from the RTU: https://pastebin.com/mCa4yj4p

and here is the file from the Supervisor: https://pastebin.com/fZj5Kmdm

MikaylaFischler commented 1 year ago

Can you go into the config of each and set LOG_DEBUG to true, then send me the logs assuming the same thing happens again?

Is this a multiplayer server? If so, do you know what its hosted on (some online service, someones computer, etc) and about how many people are on it?

It's looking like some of the RTU requests are timing out, causing them to be reported as unknown transactions, which matches up with the ping being so poor as well.

awesomedan153 commented 1 year ago

Alright, I've set the LOG_DEBUG to true and am letting it run for a second before getting the log file again.

This server is a multiplayer server that I host off of a service online. I know that the TPS has averages around 19.5 so slightly slower than normal speed but nothing crazy. It's a small server for my friends and I so no more than 2-3 people online at a time.

MikaylaFischler commented 1 year ago

It might be a good to delete the log.txt on both the supervisor and RTU first actually, since it might run out of space and I'll miss the start of the attempt.

Thanks for those details, I'm suspecting the RTU traffic is just too much with 2 boilers, 3 turbines, the environment detector and induction matrix, which I would need to address. Is the rest of your system just the supervisor, coordinator, and one reactor, or are there other RTUs as well?

awesomedan153 commented 1 year ago

log file for Supervisor: https://pastebin.com/1VQbQgZ5

log file for RTU: https://pastebin.com/VUL1sdVB

awesomedan153 commented 1 year ago

Let me know if you want me to re run. It seems that the file has the startup but I can delete the log.txt if needed.

awesomedan153 commented 1 year ago

RIght now it is just the supervisor coordinator reactor. No other RTU.

MikaylaFischler commented 1 year ago

Okay, thanks. Could you try putting the 3 turbines on their own RTU? So then that should be 4 devices on one and 3 on the other. I want to see if the problem is in the RTU code or the supervisor code. If the problem goes away, then at least it now works for you, and that should help me figure this all out quicker.

awesomedan153 commented 1 year ago

Will do and will update shortly

awesomedan153 commented 1 year ago

Looks like that solved the issue, both RTU units are staying in the 150 ms range now.

MikaylaFischler commented 1 year ago

Okay, that's very helpful, thank you. I'll try mirroring your setup and hopefully I'll be able to reproduce the problem. For now, I of course recommend continuing with the two RTUs.

I doubt this will be a quick and easy fix, so it'll take some time before this gets fixed in released code.

awesomedan153 commented 1 year ago

Great, thanks so much

MikaylaFischler commented 1 year ago

Root cause: RTU comms thread fails to keep up with packet storm when it has lots of devices since it yields for 0.05s each time between packets. Tested with 20 SNAs

[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 62
[Tue Jul 18 21:32:28 2023] [INF] supervisor connection established
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 61
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 60
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 59
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 58
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 57
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 56
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 55
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 54
[Tue Jul 18 21:32:28 2023] [DBG] comms queue length: 53
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 52
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 71
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 70
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 69
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 68
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 67
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 66
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 65
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 64
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 98
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 97
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 96
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 115
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 114
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 113
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 112
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 111
[Tue Jul 18 21:32:29 2023] [DBG] comms queue length: 110
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 109
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 108
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 107
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 106
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 105
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 124
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 123
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 122
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 121
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 120
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 149
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 148
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 147
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 146
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 145
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 144
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 163
[Tue Jul 18 21:32:30 2023] [DBG] comms queue length: 162
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 161
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 160
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 159
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 158
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 157
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 156
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 155
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 154
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 153
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 172
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 171
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 198
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 197
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 196
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 195
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 194
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 193
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 192
[Tue Jul 18 21:32:31 2023] [DBG] comms queue length: 191
[Tue Jul 18 21:32:32 2023] [DBG] comms queue length: 190

The fix for this was straightforward, I'm glad it was in the first place I decided to check. RTU comms thread will now go through the queue as fast as it can, max 100ms before yielding, just like the sessions on the supervisor. Change will be released shortly as 1.6.2-beta.

[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 1
[Tue Jul 18 21:35:29 2023] [INF] supervisor connection established
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 61
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 60
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 59
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 58
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 57
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 56
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 55
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 54
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 53
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 52
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 51
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 50
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 49
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 48
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 47
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 46
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 45
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 44
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 43
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 42
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 41
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 40
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 39
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 38
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 37
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 36
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 35
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 34
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 33
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 32
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 31
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 30
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 29
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 28
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 27
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 26
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 25
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 24
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 23
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 22
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 21
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 20
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 19
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 18
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 17
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 16
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 15
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 14
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 13
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 12
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 11
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 10
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 9
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 8
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 7
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 6
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 5
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 4
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 3
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 2
[Tue Jul 18 21:35:29 2023] [DBG] comms queue length: 1
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 20
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 19
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 18
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 17
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 16
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 15
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 14
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 13
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 12
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 11
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 10
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 9
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 8
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 7
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 6
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 5
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 4
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 3
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 2
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 1
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 20
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 19
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 18
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 17
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 16
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 15
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 14
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 13
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 12
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 11
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 10
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 9
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 8
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 7
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 6
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 5
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 4
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 3
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 2
[Tue Jul 18 21:35:30 2023] [DBG] comms queue length: 1