MightyPirates / OpenComputers

Home of the OpenComputers mod for Minecraft.
https://oc.cil.li

Hologram rendering performance #259

Closed evg-zhabotinsky closed 10 years ago

evg-zhabotinsky commented 10 years ago

Relevant part of initial message (after testing):

I use version 1.2.9.362 for MC 1.6.4. Minecraft freezes for a moment when you are looking at a hologram and part of it is updated.

More testing

It turned out that hologram updates cripple performance less on my integrated video (Intel HD 4600). With all settings at minimum it runs at 80 fps normally, 24 fps when I see a random hologram and 15-20 fps when it is updating randomly.

On discrete video (NVIDIA GT 730M) with all settings at minimum it runs at 300 fps normally, the same with an empty hologram, 120 fps when I see a fully filled hologram, 25 (!) fps when I see a randomly filled hologram and 5 (!!!) fps when it updates randomly.

The number of pixels covered by the hologram does not affect fps, so it has nothing to do with transparency, etc.

Also I took a look at the render code, and it seems to me that update lag is more noticeable due to render list recompilation. The fps figures show that it also lags when not updating (although not as notably).

The fps changes and the code suggest that the number of rendered quads (borders between 'on' and 'off' voxels) is what affects performance most, but my GPU should be able to render much more than 5 million triangles a second (~200000 triangles at ~25 fps). Minecraft renders at least 10 times more! Again, it has nothing to do with their area, only their number.
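To make the quad counts above easier to reason about, here is a minimal sketch that counts one quad per border between an 'on' voxel and an 'off' (or outside) neighbour. The 48x32x48 volume is an assumption, not something stated in this thread, and the class and method names are made up for illustration; this is not the mod's render code.

```java
/** Counts the quads a face-culled voxel hologram would have to draw. */
final class HologramQuadCount {
    // Assumed projector volume (width x height x depth); not stated in the thread.
    static final int W = 48, H = 32, D = 48;

    /** A voxel counts as "on" only if it lies inside the volume and is set. */
    static boolean isOn(boolean[][][] v, int x, int y, int z) {
        return x >= 0 && x < W && y >= 0 && y < H && z >= 0 && z < D && v[x][y][z];
    }

    /** One quad per face of an "on" voxel whose neighbour is "off" or outside. */
    static int countQuads(boolean[][][] v) {
        int[][] dirs = {{1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}};
        int quads = 0;
        for (int x = 0; x < W; x++)
            for (int y = 0; y < H; y++)
                for (int z = 0; z < D; z++) {
                    if (!v[x][y][z]) continue;
                    for (int[] d : dirs)
                        if (!isOn(v, x + d[0], y + d[1], z + d[2])) quads++;
                }
        return quads;
    }

    /** Builds either a fully filled volume or a 3D checkerboard. */
    static boolean[][][] fill(boolean checkerboard) {
        boolean[][][] v = new boolean[W][H][D];
        for (int x = 0; x < W; x++)
            for (int y = 0; y < H; y++)
                for (int z = 0; z < D; z++)
                    v[x][y][z] = !checkerboard || (x + y + z) % 2 == 0;
        return v;
    }

    public static void main(String[] args) {
        int full = countQuads(fill(false));
        int checker = countQuads(fill(true));
        System.out.println("full volume:  " + full + " quads");
        System.out.println("checkerboard: " + checker + " quads, " + 2 * checker + " triangles");
        // Throughput quoted in the comment above: ~200000 triangles at ~25 fps.
        System.out.println("quoted rate:  " + 200_000L * 25 + " triangles/s");
    }
}
```

Under the assumed size, a full volume needs 10752 quads (only the outer shell), while a 3D checkerboard needs 221184, because every 'on' voxel exposes all six faces. The count depends only on the pattern, not on how much screen area the hologram covers, which fits the observation above.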

I do not know what can be done. Maybe I just should not draw random holograms :)

fnuecke commented 10 years ago

Yeah, depending on the complexity of the geometry you make the projector display the framerate can drop quite badly. The worst-case is pretty much 50% 3D checkerboard fill. I do plan to invest some time into optimizing this, but it's a complex topic, and time is the one thing that there's never enough of. Some ideas I have are

Will have to see how much can be 'optimized away' before the optimization gets more expensive than the actual rendering, though...

If someone who has more experience with stuff like this reads this, some contributions - and if it's just ideas and pointers - would be very much appreciated.

KFAFSP commented 10 years ago

Just for further reference: I would suggest (I can't do it in Scala myself) implementing a QuadTree. That would make the culling irrelevant, since the hologram would be chunked up so that only the biggest possible cubes are drawn. However, this will not improve performance when rendering a checkerboard pattern, but internal culling wouldn't either.

Some kind of workaround: I am currently "computing" the blocks that do not have to be rendered on the controlling OC machine, as it is executed in a different thread than the Minecraft renderer. This helps improve render tick time, at the cost of slowing down the hologram update process.
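As a rough illustration of this suggestion, the sketch below uses an octree, which is the 3D analogue of the QuadTree named above (that substitution is mine, not the commenter's). It counts how many filled cubes are needed to cover a grid when uniform regions are merged. The grid size, class and method names are hypothetical, and the edge length is assumed to be a power of two.

```java
/** Counts the filled cubes an octree needs to cover all "on" voxels. */
final class OctreeLeafCount {
    /** True if every voxel in the cube at (x, y, z) with edge `size` has the same state. */
    static boolean uniform(boolean[][][] g, int x, int y, int z, int size) {
        boolean first = g[x][y][z];
        for (int i = x; i < x + size; i++)
            for (int j = y; j < y + size; j++)
                for (int k = z; k < z + size; k++)
                    if (g[i][j][k] != first) return false;
        return true;
    }

    /** Recursively splits non-uniform cubes into eight children and counts filled leaves. */
    static int filledLeaves(boolean[][][] g, int x, int y, int z, int size) {
        if (uniform(g, x, y, z, size)) return g[x][y][z] ? 1 : 0;
        int half = size / 2;
        int n = 0;
        for (int dx = 0; dx < 2; dx++)
            for (int dy = 0; dy < 2; dy++)
                for (int dz = 0; dz < 2; dz++)
                    n += filledLeaves(g, x + dx * half, y + dy * half, z + dz * half, half);
        return n;
    }

    /** Builds an n x n x n grid, either fully filled or a 3D checkerboard. */
    static boolean[][][] grid(int n, boolean checkerboard) {
        boolean[][][] g = new boolean[n][n][n];
        for (int x = 0; x < n; x++)
            for (int y = 0; y < n; y++)
                for (int z = 0; z < n; z++)
                    g[x][y][z] = !checkerboard || (x + y + z) % 2 == 0;
        return g;
    }

    public static void main(String[] args) {
        System.out.println("full 8^3:         " + filledLeaves(grid(8, false), 0, 0, 0, 8) + " cube(s)");
        System.out.println("checkerboard 8^3: " + filledLeaves(grid(8, true), 0, 0, 0, 8) + " cube(s)");
    }
}
```

A fully filled 8x8x8 grid collapses to a single cube, while the checkerboard still needs one cube per 'on' voxel (256 of them), which matches the caveat that this approach does not help the checkerboard case.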

evg-zhabotinsky commented 10 years ago

Just an idea. Maybe it would be better to implement the hologram using triangles rather than voxels? As it is now, it takes more than 10000 quads to render just a fully filled hologram. That is more than enough to construct any meaningful image if you can draw arbitrary triangles. In this case the performance problem can be fixed by limiting the number of triangles, and traffic can be limited by capping the number of changes that can be made each second. To allow smooth modifications, double buffering can be implemented (or multi-buffering, to store more frames).

As a way to maintain the "cubicity" of Minecraft, vertex coordinates can be limited to multiples of 1/16 or 1/32 (definitely configurable) of the hologram size. This might depend on the projector tier (like the available colors). Also I think it would be good to implement transparency in this case (rendering all the triangles and not only the frontmost ones).
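The coordinate restriction proposed here could look roughly like the following sketch, which snaps a coordinate (expressed as a fraction of the hologram size, 0 to 1) to the nearest multiple of 1/steps. The class name, the method name and the clamping to the hologram bounds are assumptions for illustration only.

```java
/** Snaps normalized vertex coordinates to a fixed grid, e.g. multiples of 1/16. */
final class VertexSnap {
    /**
     * @param v     coordinate as a fraction of the hologram size
     * @param steps grid resolution, e.g. 16 or 32 (hypothetically set per projector tier)
     * @return the nearest multiple of 1/steps, clamped to [0, 1]
     */
    static double snap(double v, int steps) {
        double clamped = Math.max(0.0, Math.min(1.0, v));
        return Math.round(clamped * steps) / (double) steps;
    }

    public static void main(String[] args) {
        System.out.println(snap(0.3, 16)); // 0.3 * 16 = 4.8, rounds to 5, so 5/16
        System.out.println(snap(1.7, 16)); // outside the hologram, clamped to the edge
    }
}
```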

evg-zhabotinsky commented 10 years ago

By the way, the above is also more realistic than a voxel-based hologram. Of course you can build an LED grid to display images, but it is extremely expensive and hardly scalable at all, and the LEDs need to be transparent enough if you put them really close together. You also need to feed data into such a display at epic rates. Just imagine! A small 256x256x256 colorful cube updated at 60 Hz will require multiplexing analogue signals (as in VGA) at a frequency of 1 GHz! This means 8 gigabits per second of bandwidth just for early VGA image quality (it could draw 320x200 at 256 colors, and it was not any good without picking the palette properly and preparing images carefully).

On the other hand, you can just make a cube (of any size!) that diffuses light well (but not too well) and focus light on any point inside it. It will make that point glow. You can move the glowing point in parallel lines to scan the whole cube and build a raster image (and have all the above problems, except LED prices), but you can also move it in an arbitrary pattern and draw lines (and some kind of "surfaces"). That requires much less bandwidth and allows drawing images of any size without dramatically increasing bandwidth (as the cube of the linear size) just to make it look like an image and not a set of separate dots. By the way, your mod also has exactly the same bandwidth problem :D
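The figures in this comment can be checked with a few lines of arithmetic. The sketch below takes the cube as 256 voxels per edge (which is what the 1 GHz figure implies) and 8 bits per voxel for 256 colors; the class and method names are made up for illustration.

```java
/** Back-of-the-envelope data rate for a raster-scanned voxel display. */
final class VoxelDisplayBandwidth {
    /** Voxels that must be refreshed per second. */
    static long samplesPerSecond(int w, int h, int d, int hz) {
        return (long) w * h * d * hz;
    }

    /** Raw bit rate for a given color depth. */
    static long bitsPerSecond(long samplesPerSecond, int bitsPerVoxel) {
        return samplesPerSecond * bitsPerVoxel;
    }

    public static void main(String[] args) {
        long samples = samplesPerSecond(256, 256, 256, 60);
        long bits = bitsPerSecond(samples, 8); // 8 bits per voxel = 256 colors
        System.out.println(samples + " samples/s"); // 1006632960, about 1 GHz
        System.out.println(bits + " bit/s");        // 8053063680, about 8 Gbit/s
    }
}
```

Doubling the edge length multiplies both figures by eight, which is the "cube of its linear size" growth mentioned above.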

fnuecke commented 10 years ago

Uhm, you kind of lost me there :-P Either way, the bandwidth is actually comparatively reasonable right now, since not everything is sent for each update, and the refresh rate is relatively low. The main issue is the rendering performance, really. And I'm planning on keeping it as cubes, since it just makes sense in Minecraft, and it's much easier to get into.

evg-zhabotinsky commented 10 years ago

I meant rendering bandwidth in fact)) Back to cubes :)

I have rewritten the rendering code a few times in C (I am not too familiar with Java, let alone Scala). It turned out that display lists are approximately 10% slower than just glBegin/glEnd, at least if you don't use Java :) I also found out that storing all vertices (with normals and UVs) in a buffer (shared by all holograms) and rendering them with glMultiDrawElements yields a more than 10x speedup.

Now I have problems converting it to Scala. I am stuck with this FloatBuffer that should be filled with coordinates. It just does something I do not yet understand, and glMultiDrawElements fails with an "Out of memory" OpenGL error.
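For reference, here is a minimal sketch of filling a FloatBuffer with interleaved vertex data (position, normal, UV) using only java.nio. It makes no claim about the cause of the error described above. It only shows two things that commonly matter when handing NIO buffers to native OpenGL bindings such as LWJGL: the buffer should be direct and in native byte order, and it should be flipped after writing so that reading starts at position 0. The class name and vertex layout are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Builds interleaved vertex data for a single unit quad. */
final class QuadVertexBuffer {
    static final int FLOATS_PER_VERTEX = 3 + 3 + 2; // position, normal, UV (assumed layout)
    static final int BYTES_PER_FLOAT = 4;

    static FloatBuffer unitQuad() {
        float[][] corners = {{0, 0}, {1, 0}, {1, 1}, {0, 1}};
        // Direct, native-order memory is what native bindings generally expect.
        FloatBuffer buf = ByteBuffer
                .allocateDirect(corners.length * FLOATS_PER_VERTEX * BYTES_PER_FLOAT)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (float[] c : corners) {
            buf.put(c[0]).put(c[1]).put(0f); // position in the z = 0 plane
            buf.put(0f).put(0f).put(1f);     // normal facing +z
            buf.put(c[0]).put(c[1]);         // UV equals the corner position
        }
        // Without flip(), position sits at the end and remaining() is 0.
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        FloatBuffer buf = unitQuad();
        System.out.println("floats ready to read: " + buf.remaining());
        System.out.println("direct buffer: " + buf.isDirect());
    }
}
```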

fnuecke commented 10 years ago

Aahh, I see, my bad.

From what I've seen display lists are a lot faster than direct mode - at least for the text on the screens that is certainly the case. Probably depends on the scenario.

I did mess around with VBOs a bit, too, and also got that out of memory error, even though the buffer was tiny (4 verts). I'm pretty sure it's some interference with MC's tessellator or so (since that also uses VBOs, I think). I couldn't figure out how to return the OGL state to the way MC needs it to be, so I put my focus elsewhere for the time being. If you do figure it out, that'd help immensely.

evg-zhabotinsky commented 10 years ago

Well, I forgot to say that I have quite modern hardware, and modern hardware tends to implement deprecated functionality in a rather slow way. The whole fixed-function pipeline, including glBegin/glEnd, display lists, thick lines and many other things, has been deprecated for more than 5 years. glDrawElements is faster in general (especially in Java, where calling glVertex() is rather slow), and it also works much like how things are rendered with OpenGL 3.X and 4.X. That said, the speedup on older machines may not be as huge as on mine. I was just impressed by the more than 10x speedup and forgot about that. Display lists should also be faster than not using them on computers that are at least 2-3 years old (or if you use e.g. Java, since they greatly reduce the number of required API calls).

The main problem on my machine is that recompiling the display list for a checkerboard hologram freezes video for more than 3 screen updates, that is, FPS drops below 15 for a moment, which is quite unpleasant. Overall FPS is still good enough.

Anyway, there are A LOT of ways to make a Minecraft server or a nearby client lag if you really want to. No meaningful hologram should cause lag. It is just that it is trivial to create a lagging one. A server owner might just ban players who willingly create lagging holograms))

LordFokas commented 10 years ago

The thing with display lists is that there needs to be a balance between list size and the number of lists. A lot of small lists can potentially be as bad as extensive use of directly drawing polygons.

evg-zhabotinsky commented 10 years ago

Well, I decided to implement a hack to avoid rendering totally invisible quads. Then I remembered that vertical columns in a checkerboard pattern also cause lag, even though all the rendered quads are really visible. Quad merging is now impossible in the general case because you are implementing color holograms. That said, the only remaining way to avoid lag entirely is to render all this quickly. I might look into using shaders to render it faster when I have some free time.

fnuecke commented 10 years ago

If you do manage some breakthrough, I'd be very happy for the assistance here :-) I'll probably look into splitting up the list into several lists first, that seems to be the best tweak w.r.t. effort vs. gain. Perhaps this weekend, perhaps later. Depends on how annoyed I get with debugging the native lib...
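The idea of splitting the single list into several could be organized roughly like the sketch below, which only does the bookkeeping: it tracks which chunks of the volume changed, so that only their lists would need recompiling. No OpenGL calls are involved. The 48x32x48 volume, the 16-voxel chunk edge and all names are assumptions for illustration, not the mod's actual design.

```java
import java.util.BitSet;

/** Tracks which chunks of a voxel volume need their display list rebuilt. */
final class ChunkedHologram {
    static final int W = 48, H = 32, D = 48; // assumed projector volume
    static final int CHUNK = 16;             // hypothetical chunk edge length
    static final int CX = W / CHUNK, CY = H / CHUNK, CZ = D / CHUNK;

    private static final int[][] SELF_AND_NEIGHBOURS = {
        {0, 0, 0}, {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}
    };

    private final boolean[][][] voxels = new boolean[W][H][D];
    private final BitSet dirty = new BitSet(CX * CY * CZ);

    static int chunkIndex(int x, int y, int z) {
        return ((x / CHUNK) * CY + y / CHUNK) * CZ + z / CHUNK;
    }

    /** Sets a voxel and marks every chunk whose visible faces may have changed. */
    void set(int x, int y, int z, boolean on) {
        if (voxels[x][y][z] == on) return; // nothing changed, nothing to rebuild
        voxels[x][y][z] = on;
        // A voxel on a chunk border also changes faces owned by the adjacent chunk.
        for (int[] d : SELF_AND_NEIGHBOURS) {
            int nx = x + d[0], ny = y + d[1], nz = z + d[2];
            if (nx >= 0 && nx < W && ny >= 0 && ny < H && nz >= 0 && nz < D)
                dirty.set(chunkIndex(nx, ny, nz));
        }
    }

    int dirtyChunks() { return dirty.cardinality(); }

    /** To be called once the renderer has rebuilt the lists of all dirty chunks. */
    void clearDirty() { dirty.clear(); }
}
```

With these assumed sizes there are 3 x 2 x 3 = 18 chunks, so a single voxel change would trigger recompiling at most a couple of small lists instead of one large one. The right chunk size is a trade-off between list size and list count and would need measuring.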

Wuerfel21 commented 10 years ago

Uhh, would removing the texture (aka making it 1x1) make it faster? An 8x8 texture pack helped me with normal Minecraft on a netbook :-)

Wuerfel21 commented 10 years ago

AFAIK one cube takes 12 polygons, a modern game has millions, come on!

jeffreykog commented 10 years ago

That's how Minecraft works. Minecraft does all the calculations for those polys and vertices on the CPU instead of on the GPU, which is what other games do.

evg-zhabotinsky commented 10 years ago

Ignore the above commit. For some reason it refused to work on my Intel GPU. Need to test.

evg-zhabotinsky commented 10 years ago

Well, here it is. FPS on my Intel HD 4600 is good (epic compared to upstream). Any testing appreciated. However, commit d99bc5815c680198dfa7cd94eede40e5b4e66296 makes all holograms update synchronously every 2 seconds, which in turn causes freezes. (These freezes are much more severe on upstream.) This problem is solved.