Closed ivansrbulov closed 1 day ago
Hey Ivan thanks for the report. I'll look into it. Looks like the SQL delete query may be suspicious as well.
On Sat, Nov 9, 2024, 3:14 PM Ivan Srbulov wrote:
Using sysstat's sar I have been able to plot the RAM usage of this tool on my Ubuntu server, which is dedicated to hosting only this dashboard. As can be seen in the chart below, progressively more RAM is used until the server eventually becomes unresponsive and requires a restart: Figure.2024-11-09.230202.png https://github.com/user-attachments/assets/513a18b2-f1d5-4ca0-b1a0-ce936e8d1240
From 3pm to 11pm local time, system RAM utilisation went up from 51% to 71.5%.
At this level of utilisation, I have also noticed weird things starting to happen on the dashboard. Trains are particularly affected: the charts showing travel times are pretty much broken, showing every journey and station travel time except two as 0 minutes: image.png https://github.com/user-attachments/assets/9958a9cc-23d2-4cf8-a653-e37e1de12139
The only processes running on the server are this dashboard, sar, and nginx (which I am using only to redirect port 80 to port 3000); sar and nginx utilise less than 1% of RAM combined. Screenshot of the server's htop, which shows the significant usage: image.png https://github.com/user-attachments/assets/ebc1cd3f-a0cc-4101-a197-f7827c5df534
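For anyone who wants to reproduce this kind of RAM tracking without sar, here is a minimal Go sketch that derives a "used" percentage from /proc/meminfo. This is a swapped-in technique, not what was used in the report: `usedPercent` is a hypothetical helper, and it assumes a Linux kernel that exposes `MemTotal` and `MemAvailable`. The figure sar prints may be computed differently, so the two numbers are not guaranteed to match.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// usedPercent is a hypothetical helper (not part of this project or of sar).
// It parses /proc/meminfo-style text and returns
// (MemTotal - MemAvailable) / MemTotal as a percentage.
func usedPercent(meminfo string) (float64, error) {
	values := map[string]float64{}
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		// Lines look like "MemTotal:       16384256 kB".
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		n, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue // skip lines whose value is not numeric
		}
		values[strings.TrimSuffix(fields[0], ":")] = n
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	total, okTotal := values["MemTotal"]
	avail, okAvail := values["MemAvailable"]
	if !okTotal || !okAvail || total == 0 {
		return 0, fmt.Errorf("MemTotal or MemAvailable missing")
	}
	return (total - avail) / total * 100, nil
}

func main() {
	data, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		fmt.Println("cannot read /proc/meminfo:", err)
		return
	}
	pct, err := usedPercent(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("memory used: %.1f%%\n", pct)
}
```

Running this from cron or a loop and appending the output to a file would give a series that can be plotted the same way as the sar data.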
No problem at all, thanks for looking into it! Let me know if there's anything I can do to help. And yes, well spotted on the DELETE process.
Hi there, I've inspected it and it looks like there were some unclosed HTTP requests that are now handled, and I'm now getting no sawtooth on the golang apps. Can you pull the latest and try again?
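For context on what "unclosed HTTP requests" usually means in Go: the `net/http` client requires the caller to close the response body, and skipping that can keep connections and their buffers alive. The sketch below shows the usual pattern. It is an illustration only, not the actual change made in this repository; `fetchBody` and `demo` are hypothetical names, and the local test server just stands in for whatever endpoint the apps poll.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchBody is a hypothetical helper, not code from this repository.
// It performs a GET and always closes the response body.
func fetchBody(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	// Without this Close, the connection behind resp cannot be released
	// or reused, which can show up as steadily growing memory.
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// demo starts a throwaway local server so the example is self-contained.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// A timeout keeps a stalled request from hanging forever.
	client := &http.Client{Timeout: 5 * time.Second}
	return fetchBody(client, srv.URL)
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(body)
}
```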
Have done, will come back in a couple of hours with the same info as above to see how things are looking!
After nearly 12 hours of running, it seems the fixes have worked for RAM:
But it also seems that the delete query may still be an issue? It's not possible to see this from this screenshot, but one core or the other seems to be at 100% utilisation, alternating between them, or some combination of the two, with the delete query always at the top.
If helpful I can reset and track CPU usage.
wow thanks for extensively confirming the memory fixes at least - I'll see if I can't dig to the bottom of the big delete query, but any little bit of hints helps here. Nothing obvious jumps out to me so far, but I'll check some more
No problem, was very easy to do!
On the CPU, something very weird is happening. The chart tracks the CPU% utilisation for the top 10 processes.
The Grafana spikes are when I access the dashboard, so that is not concerning / surprising.
htop:
Full .log in case you want to see it yourself. Goes crazy at 13:30. cpu_usage_top10.log
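For readers who want to build a similar "top 10 by CPU%" view from their own samples, here is a rough Go sketch of the selection step. It is not the script used to produce the log above; `procSample` and `topN` are hypothetical names, and the values in `main` are made up purely for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// procSample is a hypothetical record: one process and its CPU% at one instant.
type procSample struct {
	Name string
	CPU  float64
}

// topN returns the n samples with the highest CPU%, highest first.
// The input slice is left unmodified.
func topN(samples []procSample, n int) []procSample {
	sorted := make([]procSample, len(samples))
	copy(sorted, samples)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].CPU > sorted[j].CPU
	})
	if n < 0 {
		n = 0
	}
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}

func main() {
	// Made-up values, only to show the call.
	snapshot := []procSample{
		{Name: "proc-a", CPU: 3.5},
		{Name: "proc-b", CPU: 97.0},
		{Name: "proc-c", CPU: 12.25},
	}
	for _, p := range topN(snapshot, 2) {
		fmt.Printf("%s %.2f\n", p.Name, p.CPU)
	}
}
```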
alrighty, I've dug into the queries and believe I found the issue: I was using a dumb method to truncate history metrics. Would you mind giving the latest another go?
Sidenote, I'm also now building docker images, so you may have to also run docker compose pull in addition to docker compose down to get the latest.
Thanks for the sidenote, I followed that process and it looks like everything works so I'll close this. Thanks for a quick fix!