Open joelwurtz opened 1 year ago
It might be a lot more useful to provide ETW traces when these issues are happening. This can provide Cfx with a lot more detail of whats going on between each tick
I have a done a trace but it does not show much information about that, i'm willing to share it but does it gave away the memory also ? (If yes i idon't want to share it publicly then)
I'm wondering also, if i'm not wrong, player are subscribed / unsubcribed from those bags when they entter / leave the culling zone of an entity ? Maybe having a lot of players would then stress a lot the Add / Remove target to the state bag, if that's the case then my pr should definitely help
but does it gave away the memory also
no
Here is the profile https://www.swisstransfer.com/d/b38b42c5-15f7-4fb7-9bc7-caba244d19e2
During this profile there was a lag between 10 to 50ms when fetching the state bag
Sorry to butt in, but what tool is this? It seems to be Grafana but how are you getting performance data to it?
Sorry to butt in, but what tool is this? It seems to be Grafana but how are you getting performance data to it?
We indeed use some of those metrics, but will also wrap every event handler around our own function that measure it's performance, All our code is open source, you can see that here : https://github.com/SOZ-Faut-etre-Sub/SOZ-FiveM-Server (the metrics part is in our "core" module at https://github.com/SOZ-Faut-etre-Sub/SOZ-FiveM-Server/tree/oss/resources/%5Bsoz%5D/soz-core)
We are adding metrics on our server to detect and tackle performance on some of our scripts, and we where having a strange performance lag on one part of our code where the code is only doing something like :
Where there is a "low" player count (between 1 to 80 player), this code run normally under 5ms, however when there is more player it can go up to more than 100ms (even more than 250ms in one or two tick),
See the attached graph, (first one is the tick duration, second one is our player count)
At first we were believing there was some problems when setting the value to the callback since it will lock the data and need to update other players.
So we add a metrics around this call to mesure it
As you can see, the set in this call was always under 5 ms so this was not it, after looking more at the code and others events, we where having another tick that have the same problem, however this tick was only getting a value from an Entity state bag (and do nothing else except calculating some data and keeping it in memory)
As you can see both tick have (at some details) the exact same time in duration, after adding some more metrics around the
GET_STATE_BAG_VALUE
native we see that it was this native that was inducing the lag :When looking at the code in fivem, there is a Mutex when getting a state bag see https://github.com/citizenfx/fivem/blob/master/code/components/citizen-resources-core/src/StateBagComponent.cpp#L473
Since this problemes appear on the first time that we actually need a state bag and not after, we strongly believe that there is a lot of write pressure on this lock so it's is slow to release (maybe from another thread ?) when having a large player count.
At one point we were thinking that, since we have more people we may have more entity created (which create a state bag every time if i'm not wrong) but this was not the case as you can see here :
There seems to be no correlation between both events
Looking more at the code we also see there is a lot of time this mutex is write locked. Even when not updating the map, like here https://github.com/citizenfx/fivem/blob/master/code/components/citizen-resources-core/src/StateBagComponent.cpp#L484 (since it update only the targets i believe we can safely use another mutex here to relieve some pressure)
But i look at every place this mutex is lock and i don't see why there is this much pressure.
Our server is on 5848 build but after looking at the diff (i know there was some update on state bag some months ago), i don't think this will be resolved if we update.
Also all our graph are done on server side, i do believe there is the same problem in client but since it handles less state bag it should not appear so often.
Do you have more insights on this ?