matiasah / shadows

Shädows - A Shadows & Lights engine for löve
MIT License
169 stars 11 forks source link

Merge-sort and draw #20

Closed matiasah closed 7 years ago

matiasah commented 7 years ago

Since every object contains 'Z' attribute, I should possibly implement a table that contains all the bodies sorted by their 'Z' component.

However, I should do this every time a body changes it's 'Z' component, or every time a body is added or removed from the light world so that this table doesn't need to be generated too often.

Merge-sort would probably come in handy for this operation as it's needed to generate a output with a new table. Afterwards I would only have to iterate this table once to draw all the objects.

Skeletonxf commented 7 years ago

I had in fact coded up working examples of several sorting algorithms into lua which I could make available to the project. I did not code merge sort, however I've got heap sort and quick sort. It could also be worth looking into using love.thread with the merge sort if more performance is needed.

matiasah commented 7 years ago

Lua by default implements quicksort on table.sort, but I wanted to use merge-sort because it outputs a new array with all the elements.

However I'm not completely sure how I'm going to do this, or how I'm going to handle the changes. This is the part of the code where I'm enforced to sort everything based on how many different values for 'Z' there are. https://github.com/matiasah/shadows/blob/master/Light.lua#L131

Skeletonxf commented 7 years ago

If you want a new table you could just make a copy and then run table.sort on the copy, assuming a shallow copy is good enough. If you're running the sort every frame though presumably not much is going to get unsorted so an algorithm that is stable and runs quickly on nearly sorted arrays would be ideal, which would imply quicksort might not be well suited.

I'm not sure from the code there which table needs sorting.

I wonder if using a binary tree might also be something to consider, you could focus on maintaining the ordering in the tree on changes to the World and then simply read off the elements in sorted order when it comes to drawing. I did actually start coding a AVL tree in lua but I stopped before I finished.

matiasah commented 7 years ago

A binary tree would probably do the job best, however, it'd be better to get every generated link from the tree insertion so that I can maintain a O(1) element deletion.

I'm not aware of what the differences are between the AVL tree and the RB tree.

Skeletonxf commented 7 years ago

I think AVL trees are simpler than Red Black trees, and hence I tried implementing the former rather than the latter.

"however, it'd be better to get every generated link from the tree insertion" - I'm not sure what you mean here?

matiasah commented 7 years ago

I probably meant to get the 'Node' generated when you insert a value into the tree, and being able to do direct removal with Node:Remove().

So if you want to remove a body from the bodies binary tree, you just access the self.Node attribute and call the remove method.

matiasah commented 7 years ago

I coded a RBTree and it seems to work fine, however, I'm starting to consider that it's probably better to use a priority queue instead because the main idea was to be able to iterate from the body with the lowest 'Z' position to the body with the highest 'Z' position (as long as this position is lower than the current layer).

And I don't know what's the best way to iterate over a binary tree from it's lowest value.

Skeletonxf commented 7 years ago
local function inOrder(node)
    if node then
        inOrder(node.l)
        print(node)
        inOrder(node.r)
    end
end

This has worked for me. The only other thing you need is to go up the tree to the root before starting or explicitly save it to start from.

matiasah commented 7 years ago

I have been evaluating what are the drawbacks of using other structures than tables. I established the physics bodies userdata as a key for storing bodies on the bodies table, but if I use a priority queue or a binary tree I'm not going to have a O(1) access, so I won't be able to know if a physics body still exists efficiently, while having all the bodies sorted by height.

This would mean removing the physics world support or just not changing the light engine to optimize this feature at all.

Skeletonxf commented 7 years ago

I am currently using the physics world support so removing it would be bad news for me.

Additionally, box2d does not delete bodies without the programmer making the API calls to do it. If you need to know in constant time if bodies have been deleted or created, you could create a callback that shadow's API would document needed to be called explicitly by the code using the two libraries whenever a physics body is deleted, and you would then know if a physics body exists without having to do lots of expensive queries to establish if it did or not.

matiasah commented 7 years ago

Now I thought of having a priority queue as a second option to containing the bodies, however, if the transform of a body changes it's 'Z' it's going to be a problem because the priority queue is going to be unsorted again, thus I would need to rely on Body.Transform.HasChanged (which doesn't state what exactly changed, just that there was a change in the transform).

Therefore, too many triggers to remove/insert on the priority queue is going to be inefficient anyway. I guess it's probably better to leave the code as it is.

Skeletonxf commented 7 years ago

Having updated my game to the latest version of shadows today I've dropped fps a little and I don't have much left before it falls to sub 60. I don't want to have to drop this library.

Looking through some of the code it seems there's a few areas for improving performance related to having things stored in a sorted form. Firstly, the physics bodies are stored in the LightWorld.Bodies table using their userdata as the key

kikito's inspect library on some of the table

shadows/LightWorld.lua:210: <1>{
[<userdata 12>] = <16>{
    Body = <userdata 12>,
    ID = <userdata 12>,
    Shapes = {},
    Transform = {
      Children = {},
      HasChanged = true,
      InverseMatrix = { { 1, 0 }, { 0, 1 } },
      Matrix = { { 1, 0 }, { 0, 1 } },
      Object = <table 16>,
      Radians = 0,
      Rotation = 0,
      x = 0,
      y = 0,
      z = 1,
      <metatable> = <table 3>
    },
    World = <table 4>,
    <metatable> = <table 15>
  },
  [<userdata 13>] = <17>{
    Body = <userdata 13>,
    ID = <userdata 13>,
    Shapes = {},
    Transform = {
      Children = {},
      HasChanged = true,
      InverseMatrix = { { 1, 0 }, { 0, 1 } },
      Matrix = { { 1, 0 }, { 0, 1 } },
      Object = <table 17>,
      Radians = 0,
      Rotation = 0,
      x = 0,
      y = 0,
      z = 1,
      <metatable> = <table 3>
    },
    World = <table 4>,
    <metatable> = <table 15>
  },
... ect
}

This means you basically have to iterate over them using pairs() which can't guarantee an order of access. To even be able to sort them you are going to have to store all bodies in numerical indexes in the table. If you do this and save the amount of bodies in the LightWorld into a variable to track the length of the self.Bodies table you can replace a lot of pairs() loops with for bodies, length do which is significantly faster https://springrts.com/wiki/Lua_Performance#TEST_9:_for-loops and this speed boost should hopefully be low hanging fruit, and would be a step to making sorted drawing easier to implement.

I think it would be worth implementing a rough draft of merge sorting the bodies every time something HasChanged or even just every frame and seeing if the performance of the library doesn't drop given the boost you'd get in having simpler and faster drawing code. If it drops very slightly or not noticeably then you have the worst case scenario before more complicated optimizations could be made towards intelligent sorting and would know how effective it is likely to be.

Edit: I had a go at coding MergeSort and lua's table.sort was faster by a factor of 10 anyway so I think speed wise table.sort is the way to go.

matiasah commented 7 years ago

In that case I could just use a priority queue but we'd have to do the sacrifice said above (which is having callbacks for physics objects added and removed).

Otherwise we're going to have a O(N*LogN) test on every tick to see if every physics object still exists. (N objets multiplied by the LogN binary search(s) performed in that tick).

To re-sort the priority queue it would be best to use insertion sort as the objects contained will rarely get unsorted (only when a object changes it's Z attribute).

Body creation and removal would be O(LogN).

Skeletonxf commented 7 years ago

I agree with insertion sort, and callbacks would be fine in my use case. I could also volunteer a basic physics demo I've already made using windfield with the call backs in use (it seems the one that existed before is no longer up?)

I would expect most use cases of this library would not do a lot of adding and removing bodies every frame. All the demos I've seen on here either make them all on load or add a new thing on mouse click, so I think o(log(n)) would be acceptable.

matiasah commented 7 years ago

On the other hand, adding callbacks for physics objects addition/removal is pretty much the same as manually creating a body. Except that physics bodies automatically detect the used shapes, it may be somehow better to add a method that sets a Body's physics object to a physics body. What do you think about this?

matiasah commented 7 years ago

By the way I've also been considering that the framerate drops you're getting may be much related to the graphics calls (or shaders) than the Lua (CPU part) code, I have a GTX 650 Ti graphics card and even though it's a 2D application it still makes my GPU increase it's temperature a bit more than normal.

Skeletonxf commented 7 years ago

Do you mean creating and deleting Bodies 'manually' and then setting their physics to the physics body so they update automatically? That would be fine for me.

It's certainly possible. However since I started using this library, dropping the quality on the love.graphics.newShader code down to 1 everywhere makes no noticeable difference. I would expect some change but if there is one I can't tell just by looking at my fps.

matiasah commented 7 years ago

Yes, I do mean that.

About the shaders, if it's not the shaders what's causing this, it's probably the new effect I implemented on the library. Where if you have several bodies with different 'Z', it's going to attempt to draw the shadow of taller bodies on top of the lower bodies.

Let's say there are 'X' different values for the height of the bodies (z), and there are 'N' bodies with 'L' lights.

When a light detects a change, it draws the bodies N X times. Considering the lights factor, if all the lights suddenly detected a change, this would mean drawing the bodies N X * L times.

Having a priority queue with the bodies sorted would probably make the 'N' a few times lower, so that this effect has to compute less.

I could also make this effect being possible to deactivate so that you would only do N draws instead of N * X.

matiasah commented 7 years ago

Anyway, could you try to see how the performance of the library works if you disable the blur shader? The shader implemented here is now radial, the old used to be a normal verical/horizontal blur shader.

Skeletonxf commented 7 years ago

I have tried commenting out lines 225 to 235 in Light.lua to disable the radial blur. The footage is with vsync off to use fps as a rough measure of performance, using my main menu so very little is different between tests. In order is shaders turned off, shaders turned on and then shaders turned on with that bit of code commented out.

https://www.youtube.com/watch?v=IY5Ichh-tu0&

Very roughly around 350 fps with shaders off, then drops to around 90 with shaders on, regardless if I comment out that section (after the blank out). I think that is the section you mean? It seems the blur shader is negligible?

And here is 22 days ago version of shadows at around 100 fps with shaders on. There is no OutputShadow file here if that helps pinpoint anything.

https://www.youtube.com/watch?v=YeURtJqPkws

matiasah commented 7 years ago

Hm, commenting line 231 and 232 is a bad idea, that's the canvas that contains all the shadow textures.

At least we now know that it wasn't caused due to the shaders.

OutputShadow is a class invented to ease the management of the output produced by the shadow shapes.

But I still think that the main cause of the problems is the effect that makes taller bodies cover lower bodies with their shadow, I could modify the library to make this effect deactivable.

matiasah commented 7 years ago

Alright, so I just implemented the priority queue but it still doesn't make use of the sorted set to improve some calculations, but at least I stopped relying on 'pairs' and I'm now using for i= loops.

I could perhaps use this to work with lights/rooms and other objects.

The LightWorld class doesn't have SetPhysics anymore but it now has a InitFromPhysics method.

Bodies still handle physics object destruction automatically though, and I added a Body:SetPhysics method.

Skeletonxf commented 7 years ago

The framerate is basically unchanged :/

I had to comment out lines 178 to 195 in Body.lua, it seems there was already those functions declared above anyway and the ones at the top of the file worked.

Seeing as nothing seems to be making any noticeable difference I downloaded the first lua profiler I could find and ran it on a frame of my project on my main menu

https://pastebin.com/9XFLwtNM

It's a little hard to see until you paste it into a text editor and make the window very wide.

matiasah commented 7 years ago

Alright, I used the same profiler you used to improve the performance and I get a 10 ms improvement in my computer (which is for the first frame).

Could you try the library again? I could've also broken something.

Skeletonxf commented 7 years ago

It's possible that it's marginally faster, fps is not very good as a measure when it varies anyway :/

I've tried again, I still had to remove the 'typeOf' check in Body:SetPhysics

matiasah commented 7 years ago

Alright, so I got my first guess. The process that takes the most time consuming task is the calculation of the shadow shapes (especially for the polygon shape which on my side is consuming 79 milliseconds, that's more than 50% of the calculation time).

The light engine used to remember this shapes so that it didn't have to calculate them too often. However, I had to remove this feature because several shadows were to be casted from one body depending on the layer, so the light sources wouldn't be able to remember which shadow belongs to which layer.

(If one body moves, the shadows of the rest of the bodies wouldn't need to be recalculated).

I would re-implement this feature on the code if I knew how not to break the multiple layers feature.

Skeletonxf commented 7 years ago

I'm not really sure what a layer is. Are they discrete?

At least we've found what seems to be the main bottleneck now.

matiasah commented 7 years ago

A layer is just generated in a while loop, see https://github.com/matiasah/shadows/blob/master/Light.lua#L143.

The priority queue is used to calculate every layer efficiently. The way it works is by getting every shape that is higher than the actual layer, and draws them, then this process is repeated but the layer is increased by choosing the altitude of the lowest shape on top of the current layer.

This is what creates the effect I said earlier, where taller bodies can cover smaller ones.

matiasah commented 7 years ago

I certainly managed to implement something that memorizes the shapes, the problem with this implementation is that it's impossible to know if a physics body changed or not because it doesn't track the position and rotation of every shape. So it doesn't check if any shape has changed, to reuse the last generated shadow.

And even if I could track their position and rotation, there are still other factors that affect on this, like, if the vertices of the polygon shape changed or not, the radius of a circle shape, etc.

As well as I did on the light world (InitFromPhysics) I could copy the data of a physics body with it's physics shapes once and then work with that using the Lua coded shadow shapes instead of physics shapes, this way I would be able to add tracking information on normal shapes (which I think I already did on my local folders).

Skeletonxf commented 7 years ago

I thought every tick each body uses its Transform to check if it's moved?, Additionally this is slightly simpler because box2d is a rigid body engine

Moving or modifying a shape that is on a body is not supported. The reason is simple: a body with 
morphing shapes is not a rigid body, but Box2D is a rigid body engine

page 8: http://www.box2d.org/manual.pdf

So you don't have to worry about physics shapes morphing at all, translation and rotation should be the only things that can change if I understand this right.

I think using one system of Shapes seems a lot cleaner.

matiasah commented 7 years ago

Well, that makes things much simpler. I'll try now to make it memorize shapes until the light or a body moves.

matiasah commented 7 years ago

I re-thought of this, and actually, bodies are rigid and their shapes can't be modified. But I still need to handle it when new shapes are attached to a body.

Skeletonxf commented 7 years ago

Ah. That should still be far less things to track right?

matiasah commented 7 years ago

Indeed, however it doesn't make it as efficient as it would if you added shapes once.

Let's say, every Light generates a table of layers, every layer is a table of bodies (index) that contains a table of shapes (index) that contains a table of every generated shadow shape.

If no shapes were added, I can reuse the entire body table, however, I will only be able to use the entire shapes table.

Not bad, but not great.

matiasah commented 7 years ago

Okay, I considered everything said before, but there's a additional factor.

I can't use shapes as indexes for O(1) access while recovering the previous calculation, and this is because body shapes are a full userdata object pointing to a löve C object.

Even though the C objects may always be the same, the full userdata objects are always different, therefore the index calculated on the table (hashed index) is different for every userdata object.

So with table[userdata] I won't be able to recover the previous calculated shape. I may need to hash the object and get it's index to do this.

matiasah commented 7 years ago

The code now memorizes shapes, could you try the performance again?

Skeletonxf commented 7 years ago

It can now consistently get into 100 - 105 fps on the main menu under no recording now, so fps there has gone up around 5-15. More importantly in the in game the fps can reliably stay around 100 which is 20-25 up on last time I tested, however some of this improvement is probably due to a bug that's appeared. When a light moves and a body stays still / motionless its shadow is not updated. If I get an enemy to keep hitting a wall, the wall shadow updates as I fly around it, but if neither my player body or an enemy body repeatedly hit the wall body its shadow does not update. Also when a body is deleted the game freezes and crashes but I imagine that's not going to be too hard to track down and fix.

matiasah commented 7 years ago

About the freezing, I think the binary search fails to correctly work when trying to find the object's position in the priority queue. In my environment it's using only two bodies, could you help me out with this?

About the light issue, are you sure that's not a Star object?

matiasah commented 7 years ago

The priority queue now implements binary search + linear search if there are repeated elements. So you now should be able to remove bodies without having the application freezing.

Does the shadows not being updating problem still persist?

I seem to be unable to recreate this issue.

Skeletonxf commented 7 years ago

I am sure I am using a Light and not a star. Now the most recent version throws an error instead of freezing, with the no updating issue still happening.

I tried recreating the no update issue in a dummy project with no success, so maybe I'm just doing something wrong or the issue is really hard to produce so I'm just attaching the .love of the real game. The shaders.lua file in the root directory handles all the calls to the shadows libarary, and the levelCreation.lua inside game/ handles most of the body creation, with the game.lua and player.lua handling a little bit too. Hopefully you can diagnose it from looking through game/ and the shaders.lua file as a lot of the other code areas are quite messy but should be unreleated to the shaders anyway.

https://send.firefox.com/download/9ce35550af/#Jqs3PkMj8I7VhkEi1vKSpQ

matiasah commented 7 years ago

I can guess what the problem is, bodies being removed during iteration. I am going to try to fix it, but I'm unable to recreate your issue with your very same project. Maybe it's not calling the update function for some reason? I don't know.

At the begining, the lights were deactivated so I had to set the enable configuration to true and it worked fine. Perhaps you removed a condition to check this out.

Skeletonxf commented 7 years ago

Oh sorry I forgot to tell you how to get to the level where the problems are most easily seen. Hold CTRL and 1 to jump to the demo level, with WASD to move and Space to shoot. There it's clear that some body shadows aren't updating properly.

Having to turn on shaders at the start is somewhat intended behaviour, though I thought the default was on for desktop OSs, basically that boolean is saved to a file so I forgot about it.

matiasah commented 7 years ago

Having physics shapes seems to be very complicated, I'm just going to add a InitFromPhysics method to the bodies so that it can copy shapes and their properties.

This way I'm going to be able to handle changes in the code much more easier, and get rid of the technical debt.

Skeletonxf commented 7 years ago

That seems like a good idea to me.

matiasah commented 7 years ago

I need to ask you, could you upload the file again? I switched from computer and I don't have it here.

Skeletonxf commented 7 years ago

Sure: https://send.firefox.com/download/bce6827d4c/#APXacaFXW0LrorLyZLtUeg

matiasah commented 7 years ago

Link seems to be broken again, any possibility you could upload it on another host?

Skeletonxf commented 7 years ago

Weird. I'll try one more time and if that still doesn't work I'll look for another host. https://send.firefox.com/download/970a0a5fa0/#UHMLUHFIbE1rUyJ5vRTyTA

matiasah commented 7 years ago

What I am sure of, is that it's transform is not modifying it's Transform.HasChanged attribute, because you've been using parented transforms and I haven't and that's why it breaks.

I need to figure out why it doesn't set it's Transform.HasChanged attribute to true.

Skeletonxf commented 7 years ago

Okay that would probably explain why my simpler demo couldn't reproduce it because I wasn't explicitly using Transforms there.