thisnickwasfree closed this issue 4 years ago
I ran tests on this scene and I see the opposite effect.
Old engine: 160 FPS, 180 with the fix from the other issue.
New engine: 350 FPS, 380 with the fix.
Thanks to a different scene builder I was able to nearly cut CPU usage in half (only of the builder). It should never result in fewer FPS.
Please repeat your tests. If you are right, I'm hitting a bottleneck that does not appear on some systems, and I will continue testing on different systems.
Well, I see a noticeable FPS drop with the new version… but I do not know how to illustrate this. One more thing: even at high FPS the new version is not comfortable to look at. It looks like frames are being dropped even at 80+ FPS. I've checked CPU utilization and video card monitoring, tried disabling compositing, and found nothing.
That is strange. I will test on a different system and look for lag spikes. But at least CPU usage should have been reduced, even if I messed up something else haha.
And if you don't care about any special effects (no bloom, no AO, ...), you can use renderSet:setMode("direct") to switch to the fastest mode. But I'll see if I can find the real issue.
You were right. CPU is better, way better. But I hit a GPU bottleneck, even harder than in the last version. Well, now that the CPU is no longer the problem, I can focus on optimizing the GPU side... I'll notify you here once I find something useful.
It looks like something happens every several updates. At the moment of this event, the movement of the camera, the models and even the cursor (a picture that takes its screen coordinates from the real cursor every update) lags.
Hmm, the job executor runs in the background. It renders shadows, reflections and also the sky base reflection. I tried to make it as smooth as possible, but it may cause lags nevertheless.
Try using dream:present(false, false, true) instead (which disables the job engine). The scene will look darker because of the missing ambient color. If this solves the problem, you just need to supply your own default reflection cubemap. Or disable it with setReflection(false) and it uses a constant color, which works fine too.
No, that's not about reflections, I use these settings:
set = dream:newSetSettings()
set:setMsaa(0);
set:setFxaa(false);
dream:setFrustumCheck(true);
dream:setAO (false);
dream:setBloom (false);
dream:setShadowCascade(0, 0);
dream:setSunShadow(false);
dream:setDefaultShaderType(false);
dream:setDither(false);
dream:setAutoExposure(false);
dream:setReflection(false);
And dream:present(false, false, true) does not help either (I used dream:present(false, false) before).
The lags appear roughly every 2 seconds and look independent of FPS.
UP: Or the FPS drop is too short for love2d to show it correctly.
BTW, how do I disable shadows completely? Is it possible that the engine still recalculates them but does not show them in the render output? I mean, maybe the lags are somehow connected with shadow cascades, for example? We move, the cascades switch, and we get a lag. Or sun shadows? The sun moves, changes its position, and the sun shadows are recalculated.
Yes, LÖVE averages the FPS; I think l.t.getDelta or similar shows the real frame time. Shadows should be disabled with your settings, and if not, using present(false, false, true) would have helped. So I'm running out of ideas.
The good news is I tracked down some code which some GPUs hate. I mean, really hate. Maybe removing it will help for you too; my laptop's FPS increased by 50%. But I'm currently thinking about how I can avoid this code, since it is important. GPUs are strange.
I'll check this, but I have a GeForce GTX 1660 Ti in my current system, so it's rather hard to overload this card with something like my project.
wait a second. You get 30 FPS. On a 1660 Ti?
Sometimes, yes. Well, I could get that with the old version too, but with at least twice the draw distance. With the new one I can easily get an FPS drop from 80 to 30 in some situations where the old version shows something like 70 to 50 or 55. The trees from demo3 (150+ in the frame) plus 200 or even more ground tiles (textured planes) together make a rather heavy load. I'll optimize it somehow (at least by replacing most of the background models with one low-detail model), but for now it shows the difference between versions.
Switched off the daytime change in update: the lags are still present. Well, looking for other options.
That's why I'm not sure this is the GPU's fault. The CPU sometimes has spikes of 96+ percent (the old version peaks at 93 under the same conditions). And it looks like most of the time the new version loads the CPU less.
I am very confused right now. Your system is slightly faster than mine and therefore should easily sustain a few hundred FPS.
My project uses a tile map: a 200x200 tile field (every tile is 5 m x 5 m) with different objects on 1/4 of the tiles. I draw only objects that are closer than 125 metres and use frustumCheck. And there are a lot of mobs, MobConditions and TileConditions with their own updates. I try to optimize all this from time to time. But anyway, there can be a lot of bottlenecks in my code and in 3DreamEngine too, and sometimes they can affect each other. I mean, the main question is not performance itself, but the change in performance from version to version under the same conditions. And yes, low FPS depends on CPU load, not GPU. I've launched the project on an RTX 2080 Super several times and cannot say I've seen a difference.
Oh, ok, that explains at least something. I mean, it still should not get lower FPS, but a lot of objects are always slow, and sadly there is nothing to do.
I'm working on tools to combine meshes at runtime, for LOD generation and similar. I expect a massive FPS boost if you just chunk, let's say, 16 by 16 tiles together.
Also, you should use scenes in the future (some things are still not implemented, like frustum). Chunk tiles together in scenes, then draw the scenes. Why? Because if my math is correct, 50x50 tiles are 2500 tiles to perform frustum checks and similar on, while 10x10 scenes are only 100 to check in the first pass, discarding entire scenes without further tests. Also, a scene can be static, removing a bit of the overhead caused by dream:draw(). But the most important thing is mesh combining. I'm working on it.
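The arithmetic above can be checked with a small counting sketch. It is plain Lua and does not use 3DreamEngine: the rectangular `view`, the `overlaps` test and the 5x5 chunk size are assumptions standing in for a real frustum test and real scenes.

```lua
-- Hypothetical sketch: count visibility tests for flat vs. chunked culling.
-- Nothing here is a 3DreamEngine function; a rectangle stands in for the frustum.

local TILES = 50   -- tiles per side (50 x 50 = 2500 tiles)
local CHUNK = 5    -- tiles per chunk side (10 x 10 = 100 chunks)

-- Inclusive overlap test between a box and the visible rectangle.
local function overlaps(x0, y0, x1, y1, view)
  return x0 <= view.x1 and x1 >= view.x0 and y0 <= view.y1 and y1 >= view.y0
end

-- Test every tile individually.
local function flatChecks(view)
  local checks, visible = 0, 0
  for x = 1, TILES do
    for y = 1, TILES do
      checks = checks + 1
      if overlaps(x, y, x, y, view) then visible = visible + 1 end
    end
  end
  return checks, visible
end

-- Test chunks first; only tiles of surviving chunks are tested.
local function chunkedChecks(view)
  local checks, visible = 0, 0
  for cx = 1, TILES, CHUNK do
    for cy = 1, TILES, CHUNK do
      checks = checks + 1
      if overlaps(cx, cy, cx + CHUNK - 1, cy + CHUNK - 1, view) then
        for x = cx, cx + CHUNK - 1 do
          for y = cy, cy + CHUNK - 1 do
            checks = checks + 1
            if overlaps(x, y, x, y, view) then visible = visible + 1 end
          end
        end
      end
    end
  end
  return checks, visible
end

-- Assumed view: a 12 x 12 tile corner of the map.
local view = { x0 = 1, y0 = 1, x1 = 12, y1 = 12 }
local flatN, flatVisible = flatChecks(view)
local chunkN, chunkVisible = chunkedChecks(view)
print("flat:", flatN, "checks,", flatVisible, "visible")
print("chunked:", chunkN, "checks,", chunkVisible, "visible")
```

Both passes find the same visible tiles, but the chunked pass does far fewer tests because most chunks are rejected in the first pass. How large the saving is depends on the view and on the chunk size chosen here.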
Well, I've printed delta time from update — lag is visible.
dt: 0.0077151759996923
dt: 0.0073157500000889
dt: 0.0076234320004005
dt: 0.0075302610002836
dt: 0.0074855909997495
dt: 0.0076496889996633
dt: 0.0077351370000542
dt: 0.0070167800004128
dt: 0.0073202289995606
dt: 0.0070874789998925
dt: 0.0072007070002655
dt: 0.0077851949999967
dt: 0.007328509999752
dt: 0.006828647000475
dt: 0.0078392340001301
dt: 0.01623632799965
dt: 0.01347882600021
dt: 0.010310765999748
dt: 0.0097991989996444
dt: 0.015124320000723
dt: 0.010734306999439
dt: 0.010559958000158
dt: 0.0097718690003603
dt: 0.0095303759999297
dt: 0.0096999789993788
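A minimal sketch of how a dump like this could be analysed: it compares the average FPS over the window (roughly what an averaged counter reports) with the worst single frame, and flags frames that are much slower than the mean of the frames before them. The `analyse` helper and the 1.5x threshold are assumptions for illustration, not part of LÖVE or 3DreamEngine.

```lua
-- The dt values posted above, in order.
local samples = {
  0.0077151759996923, 0.0073157500000889, 0.0076234320004005,
  0.0075302610002836, 0.0074855909997495, 0.0076496889996633,
  0.0077351370000542, 0.0070167800004128, 0.0073202289995606,
  0.0070874789998925, 0.0072007070002655, 0.0077851949999967,
  0.007328509999752,  0.006828647000475,  0.0078392340001301,
  0.01623632799965,   0.01347882600021,   0.010310765999748,
  0.0097991989996444, 0.015124320000723,  0.010734306999439,
  0.010559958000158,  0.0097718690003603, 0.0095303759999297,
  0.0096999789993788,
}

-- Returns average FPS, FPS of the worst frame, and the indices of frames
-- whose dt exceeds `factor` times the mean dt of all earlier frames.
local function analyse(dts, factor)
  local sum, worst = 0, 0
  local spikes = {}
  for i, dt in ipairs(dts) do
    if i > 1 and dt > factor * (sum / (i - 1)) then
      spikes[#spikes + 1] = i
    end
    sum = sum + dt
    if dt > worst then worst = dt end
  end
  return #dts / sum, 1 / worst, spikes
end

local avgFps, worstFps, spikes = analyse(samples, 1.5) -- 1.5 is an assumed threshold
print(string.format("average FPS: %.1f, worst frame as FPS: %.1f", avgFps, worstFps))
print("spike frames: " .. table.concat(spikes, ", "))
```

In a running game the same function could be fed from a table filled in love.update(dt). The first flagged frame in this sample is the one where dt jumps from roughly 7.5 ms to about 16 ms.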
And I have an idea now. I have tic counters in the main scene and in all mobs and mob/tile conditions in the scene. Some operations are done on every tick of the counter (dt * 100): not many, only the most critical ones, like adding recovery time to mobs or changing the sun position. But many more calculations happen on the tic100 and tic200 events (I guess it's clear how I get them). CPU load becomes higher at the tic100/200 events, so if the new version of 3DreamEngine raises CPU utilization at the same time, I can get a lag… Removing most of the mobs has no positive effect… Not sure if the idea is really good, but I have no other one atm.
Changing love.timer.sleep(0.001) to love.timer.sleep(0.01) in the overridden love.run() makes the lags shorter, but they are still present with almost the same frequency. One more crazy idea. I see lags in two situations: automatic mob movement and moving the camera with the keyboard. I do not know what is going on with the mobs… But as for the camera: can it be connected somehow with the frequency of the keypressed event? I'm holding the «w» key (forward), love2d checks whether it's pressed, then movement starts from tile to tile. When the new position is reached, it stops, checks the key again and gets nothing. Lag. New check: oh, the key is pressed, we can move again! Can this be affected somehow by changes in the last version?
I'm not sure, but it seems to me some mechanisms in my code were somehow affected, so lags of a few ms that were unnoticeable have become half-second lags after the update.
One more update: optimizing the trees (1/3 of the polygons left, same look) raised the lowest FPS from ~30 to 41 or 42 in a few places; usually the lowest value is around 50+ FPS. I guess that replacing most of them with one big forest model will raise FPS too. But the lags look independent of FPS. Checking my code for bottlenecks again.
All synthetic demo tests show the new version is faster… But I see lags in the game. One more thing: the picture differs when rendered by the different versions. The old engine had the dream.lighting_engine = "Phong" option. What is the default in the new one? The same question about dream.alphaBlendMode.
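One way to reason about this hypothesis is to count how much periodic work lands on a single tick. The sketch below is a toy model in plain Lua: the entity count, the "one work unit per event" cost and the per-entity offset are all assumptions, not taken from the project.

```lua
-- Toy model: every entity runs a "tic100" and a "tic200" event.
local ENTITIES = 50 -- assumed number of mobs/conditions

-- Work units landing on one tick; offsetFor(e) shifts entity e's counter.
local function workOnTick(tick, offsetFor)
  local work = 0
  for e = 1, ENTITIES do
    local t = tick + offsetFor(e)
    if t % 100 == 0 then work = work + 1 end -- tic100 event
    if t % 200 == 0 then work = work + 1 end -- tic200 event
  end
  return work
end

-- Highest per-tick work over one full 200-tick cycle.
local function peak(offsetFor)
  local maxWork = 0
  for tick = 1, 200 do
    maxWork = math.max(maxWork, workOnTick(tick, offsetFor))
  end
  return maxWork
end

local shared = peak(function(e) return 0 end)    -- all counters in sync
local staggered = peak(function(e) return e end) -- each counter shifted by its index
print("peak work, shared counters:", shared)
print("peak work, staggered counters:", staggered)
```

With shared counters every entity fires on the same tick, so the peak grows with the number of entities; with shifted counters the same total work is spread over many ticks. If the periodic lag follows the tic100/200 events, shifting the counters like this should make it shrink, which would be a cheap test of the idea.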
UP: I try to draw about 1836 tiles per update (not counting the additional objects on them), and checkFrustum removes 3/4 of them. So… it looks like that's really close to the draw call limit for the engine. Well, my project needs some rework and optimization anyway, but was frustumCheck changed somehow in the last updates?
lighting_engine has been merged with defaultShader, and the default is Phong. Phong is the fastest. alphaBlendMode has been removed and is now per material (dither toggle and of course the old alpha tag), or optionally the average alpha, which is a slower approximation of several layers. So, mostly uninteresting.
frustumCheck should work the same, I think. But yes, it's the 1.8/4k objects which are actually hitting a LÖVE bottleneck (in the end, I still use classic l.g.draw()). Those need to be merged, or at least instanced. Both options are in progress.
A GPU can easily render a 30k polygon mesh. But 3000 smaller 30 polygon meshes kill it.
Interesting results. Without vsync the new version has higher FPS than the old one (by new I mean the one published after the long pause). But with vsync the old version has ~60 FPS (sometimes 57), while the new one has 48 to 53. And the new one has alpha for only 3 materials, not for all as in the old one. And anyway, the old version runs much smoother.
UP: And all demos show 2x higher FPS with the new version…
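As a rough cross-check of that tile count, the sketch below counts 5 m tiles whose centre lies within the 125 m draw distance mentioned earlier. It assumes the camera sits on a tile centre, that tiles are selected by centre distance, and that the map edge is not nearby; none of this is confirmed by the project code.

```lua
local TILE_SIZE = 5       -- metres per tile side (from the project description)
local DRAW_DISTANCE = 125 -- metres (from the project description)

-- Count grid cells whose centre is within `radius` of the camera cell.
local function tilesInRange(radius, tileSize)
  local r = math.floor(radius / tileSize)
  local count = 0
  for dx = -r, r do
    for dy = -r, r do
      -- squared distance comparison avoids a square root
      if (dx * dx + dy * dy) * tileSize * tileSize <= radius * radius then
        count = count + 1
      end
    end
  end
  return count
end

local inRange = tilesInRange(DRAW_DISTANCE, TILE_SIZE)
local afterFrustum = inRange / 4 -- assumed: frustum check removes 3/4, as reported
print("tiles within draw distance:", inRange)
print("tiles left after frustum check (approx.):", afterFrustum)
```

The count is close to the area of a circle with a radius of 25 tiles (pi * 25^2, about 1963), which is in the same range as the figure quoted above. After the frustum check that still leaves several hundred tile draws per frame before any objects on them are counted.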
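The merging idea can be sketched in plain Lua: the vertex lists of many small meshes are appended into one list, and each mesh's indices are shifted by the number of vertices already stored. The mesh layout used here (`vertices`, `indices`, `offset`) is an assumption for illustration, not the 3DreamEngine format.

```lua
-- Merge many small meshes into one vertex list and one index list.
-- Assumed mesh layout: { vertices = {{x,y,z},...}, indices = {1,2,3,...}, offset = {x,y,z} }
local function mergeMeshes(meshes)
  local vertices, indices = {}, {}
  for _, mesh in ipairs(meshes) do
    local base = #vertices -- vertices already stored before this mesh
    local ox, oy, oz = mesh.offset[1], mesh.offset[2], mesh.offset[3]
    for _, v in ipairs(mesh.vertices) do
      -- bake the mesh position into the vertex so one draw covers all copies
      vertices[#vertices + 1] = { v[1] + ox, v[2] + oy, v[3] + oz }
    end
    for _, i in ipairs(mesh.indices) do
      indices[#indices + 1] = base + i -- shift the 1-based index past earlier meshes
    end
  end
  return vertices, indices
end

-- 3000 copies of a single triangle, each at its own position.
local meshes = {}
for n = 1, 3000 do
  meshes[n] = {
    vertices = { { 0, 0, 0 }, { 1, 0, 0 }, { 0, 1, 0 } },
    indices = { 1, 2, 3 },
    offset = { n, 0, 0 },
  }
end

local vertices, indices = mergeMeshes(meshes)
print(#meshes .. " small meshes merged into 1 mesh with " .. (#indices / 3) .. " triangles")
```

In LÖVE the merged lists could then be uploaded once, for example with love.graphics.newMesh and Mesh:setVertexMap, so that one draw call replaces 3000. The trade-off is that merged geometry can no longer be moved or culled per object.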
https://ibb.co/xX15ZHy https://ibb.co/MyzN0Xd
The new engine and the old one. Well, the pictures are not 100% identical, because I changed the sun's offset in the new scene, but… is there any noticeable difference in techniques between them? Is it possible that I have something switched on in the new scene and off in the old one?
I'm not sure about the question. A few things have changed, but none of them should affect this scene.
I was able to find and improve some slow code; CPU usage and GPU overhead in particular should have been reduced. I hope I did not break anything.
The situation looks much better. Like before the big update, but with higher FPS.
Usually the FPS is close to the previous version, maybe a bit higher sometimes. But, for example, near a few dozen trees (the demo with models is uploaded, links are in the issue about the wind shader) FPS drops to ~30, while the old version has 50 to 60 under the same conditions.