StarlingGraphics / Starling-Extension-Graphics

flash.display.Graphics style extension for the Starling Flash GPU rendering framework
https://github.com/StarlingGraphics/Starling-Extension-Graphics/wiki
MIT License
283 stars 89 forks source link

[? Feature Request / Implementation question] #76

Closed AlBirdie closed 10 years ago

AlBirdie commented 10 years ago

I've got the following requirement; draw a Fill in between a certain range of vertices. Here are two images for illustration purposes; Both images show a white Stroke (ChartStroke) drawn with a set of vertices. This Stroke is crossed by a red line (LimitStroke). In between the ChartStroke and the LimitStroke sits the Fill. Now that would be quite easy to draw with the SEG if we knew the vertex where both lines cross, however, the vertical white lines in the two images show the vertex sequence, so there is no vertex where they actually cross, hence no way for me to draw the Fill properly, isn't there?

bildschirmfoto 2013-11-08 um 10 21 56 bildschirmfoto 2013-11-08 um 10 22 01

I've tried with the following code, but as you can see in the last image, the Fill can't be properly set to the position where the two lines cross since I don't have a vertex position for that. bildschirmfoto 2013-11-08 um 10 15 53

if(yWC > _overbought) { _overboughtFill.addVertex(xDC, yDC); } else _overboughtFill.addVertex(xDC, limitPos);

Now what I was thinking about is to not check if the current vertices y position is greater or smaller than the limiters y position, but to somehow add vertices to the Fill for the whole sequence (which would then properly match the ChartStroke) and then somehow limit the Fill's height, y pos (don't really know what exactly) to start at the limiters y position, so that the vertices for everything that sits below that limit isn't drawn. Would something like that be possible? If not, is there another way to accomplish this task without calculating the vertices where both lines cross (don't even remotely know how to do that to be honest)?

Edit: On second thought: a "getYBetweenVertices(startIndex:int, endIndex:int):Number" in the Stroke class would probably do the trick without the need to mess around with the Fill class, wouldn't it?

IonSwitz commented 10 years ago

I have a little bit of a hard time following what it is you wish to achieve... Sorry, I might be dumb :)

Could you create a small code snippet that recreates the situation ( or a similar one) that I could test and play with?

"getYBetweenVertices" seems like a very ad hoc solution, to be honest.

IonSwitz commented 10 years ago

I've looked into this, was hoping to be able to make a method that returns a vector of fills between two strokes, but it got complicated. ;)

No obvious, general solution springs to mind yet.

AlBirdie commented 10 years ago

You're right, yBetweenVertices wouldn't solve the issue at all since the two lines don't necessarily need to cross exactly in the middle of two vertices. I think the easiest way to solve this is for me to find out where the two lines actually cross and draw the fill accordingly, no change of the API required.

jonathanrpace commented 10 years ago

Yeah, I'm not really in favour of the graphics API becoming a geometry intersection library too :)

There is a slightly different angle you could take that could save you some CPU cycles. If you were to fill the graph as usual, but render transparent the area below the line, that would visually give you the same result.

This could be achieved with a texture with its top half opaque and its lower half transparent. You could then supply a transform matrix to the beginTextureFill function to cunningly shift the texture up/down based on the ratio between the min/max of your graph. You may need to scale along the Y axis too - or even draw the texture yourself on-the-fly using Flash's (standard) graphics API.

AlBirdie commented 10 years ago

That does sound like a good solution! Thanks for you input Jonathan and of course thanks to IonSwitz as well. Since I've got enough input now, I'm closing this one.

@IonSwitz I haven't forgotten about your SWC request! I'll just need another one or two days, sorry.

IonSwitz commented 10 years ago

@jonathanrpace Using the texture fill "hack" there will make the "precisionHitArea" calculation fail though, as that checks geometry, not texture transparency. Depends on if @AlBirdie wants to use that for the fill or not, I guess.

AlBirdie commented 10 years ago

Thanks for chiming in again IonSwitz, actually haven't thought about that. Though I current don't need the precisionHitArea for the Fill, it might become a requirement later on. Actually I came up with two other ways that might solve the issue. One would be to calculate the point where the two lines intersect (that would definitely work, it wouldn't be the most performant solution though), the other to apply a cliprect to the Fill. Not sure the latter would work with the graphics extension, but it would likely be the solution that requires the least processing power.

jarrodmoldrich commented 10 years ago

Hi AlBirdie,

I can't tell if the line you're intersecting with is always perfectly flat and that the squiggle is a graph of y = f(x). If so.. you could take each two points of the graph (x1,y1 and x2,y2) and find the intersection with a y value, for this example we'll say it's 10. So the equation for any line is y = offset + slope * x. Unless it's vertical, in which case it means you have a singularity in your graph!

slope = (x2-x1)/(y2-y1) offset = y1 - x1_slope y = offset + x_slope

let y = 10 and solve for x 10 = offset + x*slope (10 - offset)/slope = x

so x = (10 - offset)/slope where y = 10

I hope I didn't screw up my maths... but that should get you the intersection point.

Now you just need to test if x1 <= x < x2 to see if the intersection falls within your segment. If so, you should add an extra point to your fill geometry.

Hope that helps!

Jarrod

On 11 November 2013 22:49, AlBirdie notifications@github.com wrote:

Thanks for chiming in again IonSwitz, actually haven't thought about that. Though I current don't need the precisionHitArea for the Fill, it might become a requirement later on. Actually I came up with two other ways that might solve the issue. One would be to calculate the point where the two lines intersect (that would definitely work, it wouldn't be the most performant solution though), the other to apply a cliprect to the Fill. Not sure the latter would work with the graphics extension, but it would likely be the solution that requires the least processing power.

— Reply to this email directly or view it on GitHubhttps://github.com/StarlingGraphics/Starling-Extension-Graphics/issues/76#issuecomment-28192432 .

Jarrod Moldrich - Mobile Apps Developer mob: +61401686802

AlBirdie commented 10 years ago

Cheers mate! That's actually pretty close to the method I'm using to calculate the intersection points, I'm calculating offset and slope for both lines though (might become a requirement later and I thought since I'm already on it... ;). Suprisingly it doesn't measurably add up to the whole drawing time (that is with the check if there is an intersection), so I'm pretty satisfied with that solution.

Thanks again guys, appreciate your thoughts and solutions on this!

jonathanrpace commented 10 years ago

@AlBirdie - In case you didn't spot it - this commit adds a TriangleStrip primitive, which will be practically CPU free compared with the Fill primitive for drawing filled graphs.

https://github.com/StarlingGraphics/Starling-Extension-Graphics/commit/792c38347f33c6fcd3c489eaf2da34b17d82be08

AlBirdie commented 10 years ago

Thank you very much Jonathan. Going to try this out first thing monday morning!

Edit: Gosh I love this graphics api! Once again you've made my day mate. Can't thank you enough. Couldn't help myself and did a little testing already, awesome job Jonathan! :+1: This does cost pretty much nothing, can't even compare it to using a Fill. Seems we will have chart scrolling with live updates and "Fills" after all. :) Now I just have to figure out how to clear that thing properly without running into the Buffer too big error. ;)

jonathanrpace commented 10 years ago

Good to hear!

Hopefully you can see from the code that there's really not that much involved with creating a 'custom' triangulated object. In principle there are all kinds of shapes you can triangulate trivially, as long as you know ahead-of-time the kind of shape you'll be drawing.

If things get even more CPU-tight, you may want to consider using this triangle strip class for your strokes too. It won't do any of the nice 'mitering' at the joints that you get with the Stroke class, but if your use-case is restrictive enough (thin line-graphs), you may get away with it.

jarrodmoldrich commented 10 years ago

Hey folk!

Might have missed something. But on looking at Stage3D a little while ago I discovered that it was only capable of indexed triangle lists, and so triangle strips provide no performance benefit as they would have to be turned into individual triangles anyway. Did this change? Or is this new triangle strip primitive an abstraction for convenience sake?

Just curious :)

Jarrod

On 29 November 2013 22:56, Jonathan Pace notifications@github.com wrote:

Good to hear!

Hopefully you can see from the code that there's really not that much involved with creating a 'custom' triangulated object. In principle there are all kinds of shapes you can triangulate trivially, as long as you know ahead-of-time the kind of shape you'll be drawing.

If things get even more CPU-tight, you may want to consider using this triangle strip class for your strokes too. It won't do any of the nice 'mitering' at the joints that you get with the Stroke class, but if your use-case is restrictive enough (thin line-graphs), you may get away with it.

— Reply to this email directly or view it on GitHubhttps://github.com/StarlingGraphics/Starling-Extension-Graphics/issues/76#issuecomment-29512136 .

Jarrod Moldrich - Mobile Apps Developer mob: +61401686802

jonathanrpace commented 10 years ago

Just an abstraction - it still uses indexed triangle lists, but the way the indices are generated produce something in a similar vein. The performance benefit here is primarily a CPU one, as it can be used instead of going through the Fill primitive's general purpose triangulator.

AlBirdie commented 10 years ago

Jonathan, I'll certainly try this with my strokes as well! It would be nice to get the same performance boost as from going from Fill to TriangleStrip. That difference is just unreal. :+1:

If you like you could add the clear() method to the TriangleStrip class;

public function clear():void {
           // since the Vectors have already been created, I went for just setting their length to 0 instead of creating new ones
        vertices.length = 0;
        indices.length = 0;

        if(minBounds)
        {
            minBounds.x = minBounds.y = Number.POSITIVE_INFINITY; 
            maxBounds.x = maxBounds.y = Number.NEGATIVE_INFINITY;
        }

        numVertices = 0;
        isInvalid = true;
    }

Or just give me a heads and and I'll put a pull request together.

IonSwitz commented 10 years ago

Very nice. I am now looking into allowing for the precisionHitTest to work on the triangle strips as well, as I know that you use the precision hits, AlBirdie.

To avoid the winding problem, I implemented another IsPointInTriangle function, IsPointInTriangleBarycentric, to use the barycentric calculation. There's a bit more math involved, with no easy outs, but it means I don't have to bother with the winding.

Please try it out.

I added better support for _precisionHitTestDistance as well now.

AlBirdie commented 10 years ago

Wow, thanks mate! That sounds very promising, going to try it later this week.

IonSwitz commented 10 years ago

I've added a solution to the precision hits, so go ahead and make the strips as crazy as you can :)

AlBirdie commented 10 years ago

Don't you worry, since these are stock charts, those strips can assume pretty ridiculous shapes. ;) Just give me till the end of the week, have to finish a release first.

AlBirdie commented 10 years ago

Sorry but this has to wait till next year, unfortunately. But I'm fairly certain that your implementation is sound, at least I haven't found a case where this wasn't the case. :)

AlBirdie commented 10 years ago

Just a heads up, QA didn't find any issues with the hit test on the TriangleStrips. :) Thanks for that feature @IonSwitz !

@jonathanrpace, Unfortunately TriangleStrips don't work properly as a replacement for (thin) strokes. Even with a 1px stroke you'll end up with loads of missing sections.

IonSwitz commented 10 years ago

@AlBirdie - Glad to be of help. I have been thinking about ways to optimize the hit collision test, as the approach now is fairly naive, but it would take more effort to implement and unless the triangle strips get very long, it might not be an issue at all.

AlBirdie commented 10 years ago

Personally I'm more than happy with the solution as it is. Currently I don't have any requirement that would require anything to be changed with the hit test and I don't see one coming in the future to be honest, so unless others need this feature or such optimisation currently tickles your fancy, I wouldn't bother. :)

Instead, a little boost from the Stroke class would be aces. As I said, TriangleStrip won't do the trick as a Stroke replacement. But then again, I'm not asking for anything here! We all are more than lucky to have this awesome extension, and considering that you guys are doing this in your free time I don't really feel good asking for features. So if you have time, or better yet, need this yourself, a faster stroke would be nice to have. If there isn't a way to optimise it, that's fine as well.

IonSwitz commented 10 years ago

Not promising anything, but in what scenarios do you get issues with the Stroke class and it's performance?

Could you give me a small sample code to work on? Something that slows your system down without taking 4 years to implement? :)

AlBirdie commented 10 years ago

The scenario is when I'm adding several Strokes (10+), each with 500 vertices, and try do completely re-draw those every frame. Although that does work with nearly 60fps on newer devices (with even more strokes), older devices (that are still being sold today, thanks Apple) tend to struggle with it.

I think I'm gonna make a small example project with some kind of line chart that allows you to dynamically add strokes to it. That's gonna take at least till next week though. Pretty busy with other tasks right now.

IonSwitz commented 10 years ago

When you say "completely re-draw", do you mean that you add the 500 vertices to each stroke each frame, or that you generate the geometry once per Stroke and merely render the Strokes every frame?

Also, do you use "lineTo" or "curveTo", ie straight lines between the vertices or do you draw curves?

Depending on these answers, I might have more to go on. I love optimization problems, but I want to have a good reproduction case to work from before I start :)

AlBirdie commented 10 years ago

In that case let me finish (the weekend is close :)) the example project first. It's tough to explain without you having the opportunity to look into the actual code.

This is basically the draw method for a single line. The loop goes through all the values (up to 500), calculates the position of the vertices, and then adds the vertices to the stroke.

function draw() { _trixStroke.clear(); for (var j:int = startX; j <endX; j++) _trixStroke.addVertex(xDC, yDC, mainLineThickness); }

draw() itself is being called on change of a slider, so I guess that isn't really every frame, I'm not really sure though. At any rate, it is being called quite frequently.

Again, there is no big deal for a single line, but make it 10 or 20 (depending on the device) and you'll bring even the fastest devices to their knees. Regarding straight lines or curves, these are all straight lines between two points, pretty much like the google finance chart: http://www.google.com/finance?q=apple&ei=12a2Ubj4M8vcqAGdmQE

IonSwitz commented 10 years ago

I made a guess and set up this sample:

private function onEnterFrame(e:EnterFrameEvent) : void { for ( var i : int = 0 ; i < 10; i++ ) { var s:Stroke = _strokes[i]; s.clear(); for ( var j:int = 0; j < 5000; j++ ) { s.addVertex(100 + Math.random() * 1000, 100 + Math.random() * 800); } } }

It turns out I came pretty close to what you were doing. :)

And, yes, I have found one interesting performance hit. In the method createPolyLinePreAlloc in Stroke, there were two Function "pointers" being used:

var sqrt:Function = Math.sqrt; var sin:Function = Math.sin;

so, instead of calling Math.sqrt directly, the Function variable "sqrt" was used.

When I removed this (I have checked in the code already) and called Math.sqrt and Math.sin directly, performance was markedly improved, and most surprising, the time spent in Garbage Collection was reduced, according to Scout.(not sure how this matter).

I went from a ~45 FPS scenario to a near-60 FPS scenario on my test machine here after the fix, so please try this stuff out.

IonSwitz commented 10 years ago

BTW, the same peculiar notation is used in CurveUtil , so I think this can improve the performance of curves as well. However, I don't want to check in that stuff before I have a good verification test set up.

IonSwitz commented 10 years ago

I have now added the same fix to CurveUtil. The performance boost for large number of curves was quite significant, calculation times for this:

_shape.graphics.clear(); _shape.graphics.lineStyle( 1, 0xFFFFFF );

        var startX:Number = Math.random() * 500;
        var startY:Number = Math.random() * 500;
        _shape.graphics.moveTo(0, 0);   
        for ( var j:int = 0; j < 500; j++ )
        {
            _shape.graphics.curveTo( startX + 100 + (j * 0.1) , startY + 100, startX + 400 + (j * 0.1) , startY + 400 );
        }

was cut in half, more or less.

Significant improvement, at least on the PC. Mobile devices still remain outside of my ability to verify.

AlBirdie commented 10 years ago

Wow, you surprise me once again! Certainly did not expect such a quick response, if any, especially not with an optimisation already. :+1:

I hope I can get to that tomorrow, I've got a few backend issues at the moment which have to be sorted out first. Believe me I'd test it right now if I had the right stuff at home, can't wait! :)

IonSwitz commented 10 years ago

Depending on your needs, it might be beneficial to add the ability to only update a certain range of vertices.

It seems annoying that we have to redraw a 500 vertex Stroke completely when we add the 501st vertex. However, this would require more effort, and might not at all be what you need?

I assume all points of the stroke is changed when you need to redraw the Stroke?

jonathanrpace commented 10 years ago

Wow - that is a suprising discovery. I created those local references to sqrt and sin to avoid a static lookup - a technique that I'm fairly certain used yield improved execution time, though this may have been in the old AS2 days.

Guess it goes to show you should test your own profiling assumptions now and again :)

IonSwitz commented 10 years ago

It might be worth it for variables: http://jacksondunstan.com/articles/1690

But for functions, I think there is a lot of run time checks for parameters, number of parameters, etc. In a call to a Function variable, the system needs to check the number of arguments, probably box the Number into a new Object, etc. (That might be why the GC is running like crazy, btw)

I was, however, as surprised as you are. :)

AlBirdie commented 10 years ago

@IonSwitz Very good catch! Theoretically there are cases where not all vertices change, like when you've got a line displaying 100 items that range from 20 to 80 and you add an item with a value > 30 or < 80 and remove an item in the same range. That way you'd only need to add that last vertex and remove the first vertex, since the min and max values remain the same. In case the min/max values do change, which I'd say happens at least 9 out of 10 times, you'd need to redraw the entire thing, though. Considering that only some frames would benefit from this kind of optimisation, I personally wouldn't go into that unless it is a small change that can be done in minutes, but since you've already said that it isn't, I guess we can drop that. But I like your thinking, I haven't thought of that to be honest.

Regarding the sqrt function call, I just did a test with your sample code above since I don't have access to real life data at the moment. These are the results on an iPad2 iOS7.04, AIR 3.9, ASC2.0, with 10 Strokes;

With sqrt/sin variables: around 12 fps Direct use of sqrt and sin: fairly stable 18.6 fps.

:+1: Did not see that coming from first runs in the emulator. Impressive improvement for such a little change. As IonSwitz already pointed out, the improved Stroke class is much less heavy on the GC, which you can actually see in the fps as well. Whereas with the old Stroke class fsp would drop to some low 11s about every second, the improved version not only doesn't drop as much, it doesn't do it as often either, only about every 3 seconds.

I don't expect other iOS devices to behave differently, so I didn't bother testing them. However, I will certainly do a test with real data as soon as possible and hope to see such a big improvement as well.

IonSwitz commented 10 years ago

Still, though it's better than 12 FPS, 18.6 FPS is sad :(

It is also possible that the code that adds vertices (addVertex) could be boosted some as well, but the bulk of the rendering time (from Scout) is still in the draw part of the code. I will check further when I have time.

AlBirdie commented 10 years ago

Excellent. Again, thanks for your efforts mate, really appreciate it!

IonSwitz commented 10 years ago

After some further Scouting, reducing the number of "if"-statements in createPolyLinePreAlloc, skipping a function call when it's not needed (setGeometryInvalid in AddVertex) and not reallocating the vertices vector under some specific circumstances (old and new length are the same), this test case:

    for ( var i : int = 0 ; i < 10; i++ )
        {
            var s:Stroke = _strokes[i];
            s.clear();
            var startX:Number = Math.random() * 500;
            var startY:Number = Math.random() * 500;
            for ( var j:int = 0; j < 5000; j++ )
            {
                s.addVertex( 100 + Math.random() * 1000, 100 + Math.random() * 800, 1, 0xFFFFFF, 1, 0xFFFFFF, 1);
            }
        } 

Was improved from 8,4 FPS to 10.6 FPS, about 25% improvement.

For more real-world scenarios, this might be a lot less, but some boosts should be had none the less. For a real world scenario, I am guessing about 10%.

IonSwitz commented 10 years ago

It should be mentioned that the frame rates are measured from inside Scout, so it comes with the overhead of Scout in all scenarios. Running fully "free", with just the Flash Player, Release 11.8, measuring the FPS with Fraps, I get about 16 FPS before and about 20 FPS after these fixes, still about 25% improvement, so it seems pretty legit.

AlBirdie commented 10 years ago

Alright, finally sorted out the middleware issues and did some testing with the improvements IonSwitz implemented.

The results pretty much reflect what the simple test cases here already showed, I've got a pretty solid 50% framerate increase in total, which is a very solid result. Highly appreciate what IonSwitz has done with this!

On identical iPad2s with the very same configuration, framerate went up from 10 to 15fps for 14 Strokes at 350 vertices each. Although 15 isn't good per se, this scenario is very rare (no one would possibly display that many strokes at the same time), and now the whole thing became usable. At 10 it just wasn't.

So yes, in this (almost) worst case szenario, in theory users can add another two strokes to that in the very near future, we are still only half way towards our goal of 30fps, but since this is on the slowest device we are going for it isn't that bad. The current iPad Mini can do the same task at 45fps, which makes me very happy. Not sure if there is still room for improvement here, that's something you guys have to decide.

IonSwitz commented 10 years ago

Glad you get similar results in "real life" as in the test cases. Excellent! :)

The question now becomes: Can you do something to bump the performance on your end? Not knowing how your strokes look, and how close the data points you use are to eachother, but I would consider adding a distance test to the vertex points I add, if I were you.

Basically, instead of doing:

addVertex(newX, newY....);

do:

if ( distanceBetweenPoints(newX, newY, previousPointX, previousPointY) > minDistanceValue ) { addVertex(newX, newY....); previousPointX = newX; previousPointY = newY; }

Basically, if the distance between two points is too small, don't add it as a vertex.

I am assuming that, if you have multiple Strokes consisting of 350+ vertices each, they are bound to be fairly close to one another, otherwise the screen will look nightmarish. ;)

Since I don't know your data set situation, maybe this is stupid advise, but I think it could be a way to reduce the amount of vertices you add to the Stroke.

AlBirdie commented 10 years ago

Pretty happy with the 50% boost as well. Since the vertices are usually fairly close to another, in theory that does sound like a solid optimisation. However, given that these are stock charts, I have to draw each and every vertex. I can already see bug reports rolling in in case vertices are missing. I've send you a mail explaining my use case in a little more detail and hope that it clears things up to what I'm really after here.

IonSwitz commented 10 years ago

I saw your E-mail and I understand the issue.

As an alternative, I would look into skipping vertices where the tangent of the curve changes too little to be noticed. It takes a bit of more math, but basically, if you calculate the inclination from point A and point B , and then from point A and point C, and check the difference between those two. If they the difference is small "enough", B could be skipped.

But, yes, I do also understand your problem, so maybe it won't work very well either.

AlBirdie commented 10 years ago

Actually, after analysing some strokes that could work. I'll definitely look into that. More calculations shouldn't be an issue. Calculating the vertex points is already a pretty hefty task one might think (in fact there's a lot of math involved), but it takes almost no time at all.