Closed magicus closed 2 years ago
Night-config internals scares me more and more. Guess I'll cache the values then.
I've drilled a bit further down in the JFR analysis. hasEntitiesLightSource()
(and the block variant) comes from getLuminanceFrom()
. The config check here seems to be the root of the problems; I'm guessing it should be cached as a boolean in DynamicLightHandlers
.
It should be cached in the config class itself, since the dynamic light handlers are not the only source of calls.
Fair enough, you know the code best. :)
FWIW, I've found no more performance outliers from LambDynamicLights. The next top CPU hog is getDynamicLightLevel(), but that's kind of expected (messing with lightning is sure to get you on the top performance affecting mods :-)), and that method seems hard to optimize further. Caching light levels for positions is probably not worth the hassle. Overall, we're down to about 0.3% so Amdahl's says I should probably give up my analysis now. :)
(I could not stay away from the analysis... :-) LambDynamicLights is still the only mod showing up as affecting performance as I work myself down the list.)
A second performance problem which creeps up in several places is the ConcurrentLinkedQueue dynamicLightSources
. The iteration of this queue (ConcurrentLinkedQueue$Itr.next()
) seem to be responsible for about 2/3 of the execution time of getDynamicLightLevel
(so much for my assessment that it could not be optimized further...).
Also, when adding new light sources, the entire queue must be traversed to check if the source is already present, causing ConcurrentLinkedQueue.contains
to show up as a hot method. It might be that the list is typically long, and should be splitted per chunk/region/whatever. Or just that a queue is not an optimal data structure for checking if a value is already present. A hashmap might be more efficient. Since only the max level is extracted, I can't see that ordering matters, so iterating over a hashmap in getDynamicLightLevel
would probably be alright as well.
Also, it might be worth looking over the concurrency requirements here, since having a concurrency-safe data structure in a hot code path is sure to cause extra penalties. For instance, if it is unlikely that this structure is added to, but often iterated over, a reader/writer style protection, or a classic java synchronized lock might be a better fit. Otoh, if getDynamicLightLevel
is called from multiple threads, this is a no-go.
Let me know if you want me to submit a PR with a fix. (I suspect a developer familiar with the code can make it quicker than I, but I'm willing to help if there is not enough resources on the project.)
Very sorry but while reading I totally missed this part, feel free to PR, I'm not currently much available so any help is appreciated.
@LambdAurora I don't have that much time myself :) but I have now submitted three PRs with different performance optimizations. The net result of applying all of them is that LambDynamicsLight hardly show up in the JFR method profiling, from being quite prominent, which I think is quite nice.
By its very nature, LambDynamicsLight is always going to be expensive, but now it is at least not more expensive than necessary.
Fixed by commit 2678e56a2fc63531f2d9d9f59dfafdab7217f88e.
I was analysing my Minecraft setup with JFR to find mods that were misbehaving. One of the hottest methods from outside Vanilla came from LambDynamicLights. (In the JFR recording, I just stood still and looked around my base.)
The hot method is Arrays.copyOf(Object[], int) which is responsible for > 1% of all CPU cycles. The stack trace leading to this is:
which is called from
dev.lambdaurora.lambdynlights.DynamicLightsConfig.hasEntitiesLightSource()
anddev.lambdaurora.lambdynlights.DynamicLightsConfig.hasBlockEntitiesLightSource()
, respectively.I have not dived into the the source code yet, but by the name of the class, it sounds like getting an "Unmodifiable Config" should not really have to do a string split (which ends up doing array copies) each time it is called. I imagine there is some easy room for improvement here.
Let me know if you want me to submit a PR with a fix. (I suspect a developer familiar with the code can make it quicker than I, but I'm willing to help if there is not enough resources on the project.)