projecthorus / sondehub-tracker

🎈 Frontend for SondeHub Radiosonde Tracking
https://v2.sondehub.org
MIT License

websocket error #263

Closed 14ri004 closed 1 year ago

14ri004 commented 1 year ago

SondeHub has been crashing for several days with a websocket error.

14ri004 commented 1 year ago

OK Mark, I just cleared the Firefox cache.

Current data on the Core 2 Duo:

(screenshot: Capture)

darksidelemm commented 1 year ago

Yet more changes made to try and speed things up further...

TheSkorm commented 1 year ago

Testing version: image

Prod version: image

CPU usage is also about 20% lower than prod, according to Task Manager.

I was able to roughly replicate the data error issue on prod by using a 1-core, 4 GB machine running in a VM, suspending the tab, waiting for a large number of messages to buffer, and then unsuspending the tab. On unsuspend it tried to catch up but never did.

(screenshots: Screen Shot 2022-10-06 at 10 48 07 pm, Screen Shot 2022-10-06 at 10 37 35 pm)

These are probably the last changes we can make to SondeHub Tracker to improve performance without a full rewrite of how the objects are loaded (which none of the team has time to do).
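One way to see why the unsuspended tab never caught up: a backlog only shrinks if the client processes messages faster than new ones arrive. The sketch below is a back-of-the-envelope model and not tracker code; `seconds_to_drain`, the backlog size, and all of the rates are hypothetical values chosen for illustration.

```python
from typing import Optional


def seconds_to_drain(backlog: int, arrival_rate: float, process_rate: float) -> Optional[float]:
    """Return how long a client needs to clear `backlog` queued messages.

    arrival_rate and process_rate are in messages per second. Returns None
    when the client cannot process faster than messages arrive, in which
    case the backlog never shrinks.
    """
    if backlog <= 0:
        return 0.0
    net_rate = process_rate - arrival_rate  # messages cleared per second
    if net_rate <= 0:
        return None
    return backlog / net_rate


# Hypothetical numbers: 9000 messages buffered while the tab was suspended,
# with 30 msg/s still arriving after it wakes up.
fast_client = seconds_to_drain(9000, arrival_rate=30, process_rate=120)
slow_client = seconds_to_drain(9000, arrival_rate=30, process_rate=25)
print(fast_client)  # 100.0
print(slow_client)  # None
```

Under this model a slow machine that handles fewer messages per second than the feed delivers stays behind indefinitely, which would be consistent with the behaviour seen in the VM, although the actual cause there may differ.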

TheSkorm commented 1 year ago

I think I found a non-tracker issue that might be the cause. I'm investigating.

14ri004 commented 1 year ago

It worked very well between 1:00 p.m. and 2:30 p.m.: perfect reception, with smooth and fast updating. From 2:30 p.m. it was misery: websocket at 30 msg/s, random updating, the page frozen for 1 to 2 minutes, and data arriving in bursts every minute or even longer. I have the same problem on my i5, and other people see exactly the same thing at the same time. I checked the Firefox task manager and the PC seems to be running correctly, with practically identical results on my i5. I do not understand what is happening; the reception problems always appear around the same time.

(screenshot: Capture2)

14ri004 commented 1 year ago

Strangely, around 3:00 p.m. everything started working fine again: normal decoding, correct updates, websocket at 30 msg/s, and no more random blocking. For the moment it's perfect.

For 40 minutes reception has been perfect and fluid at my place on all 3 PCs, and friends report the same with no problems. We will see tomorrow whether the problem is solved. Thank you for your patience and help.

TheSkorm commented 1 year ago

This issue looks like it is caused by our MQTT read instances (what the tracker connects to) being overwhelmed and running out of CPU credits. (image) When an instance runs out of credits it appears to be dropping connections, or pausing packet delivery to certain clients. (image)

Since there are multiple servers, if the websockets disconnect or drop, the client might connect to another instance which is working (until that server is overwhelmed).
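As a rough illustration of that reconnect behaviour, here is a minimal sketch of a client rotating through several servers with capped exponential backoff. It assumes the client itself picks which server to try, which this thread does not confirm (connections may well be distributed some other way); `connect_with_failover`, `try_connect`, and the endpoint URLs are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def connect_with_failover(
    endpoints: List[str],
    try_connect: Callable[[str], bool],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
) -> Tuple[Optional[str], List[float]]:
    """Rotate through endpoints until one accepts the connection.

    Returns the endpoint that connected (or None) and the list of delays
    the caller would have waited between attempts. Delays are returned
    rather than slept so the sketch stays quick to run.
    """
    delays: List[float] = []
    for attempt in range(max_attempts):
        endpoint = endpoints[attempt % len(endpoints)]
        if try_connect(endpoint):
            return endpoint, delays
        # Exponential backoff, capped, so an overloaded server is not hammered.
        delays.append(min(max_delay, base_delay * 2 ** attempt))
    return None, delays


# Hypothetical endpoints; only the second one is healthy.
servers = ["wss://read-a.example/ws", "wss://read-b.example/ws"]
healthy = {"wss://read-b.example/ws"}
chosen, waited = connect_with_failover(servers, lambda url: url in healthy)
print(chosen, waited)  # wss://read-b.example/ws [1.0]
```

A real client would usually add random jitter to the delays so that many clients dropped at once do not all reconnect to the next server at the same moment.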

At around 12:50 UTC I doubled the number of servers, and this will hopefully fix the issue. This comes at a cost, however, so please make sure y'all are signed up on Patreon or monthly PayPal donations.

I think this has fixed the issue, however it's hard for me to test as the issue only occurs at midnight our time and we need to work in the morning :)

@darksidelemm I still think it's worth rolling out the UI speed improvements. Those have certainly improved the speed of the UI in a benchmark-able and demonstrable way. This will drop support for any clients that don't have websocket support, but I think this is OK.

14ri004 commented 1 year ago

Hello, indeed for 2 days the system has been working perfectly again: no more slowdowns, perfect data updating, and no more blockages or frozen pages. We are back to how it was 1 month ago. Good job, have a good weekend, and thank you.