cytoscape / cytoscape.js

Graph theory (network) library for visualisation and analysis
https://js.cytoscape.org
MIT License

Cache element textures/bitmaps #593

Closed: maxkfranz closed this issue 8 years ago

maxkfranz commented 10 years ago

Caching only really helps with nodes, as edges almost always differ from one another. And nodes aren't the rendering bottleneck anyway.

It could help with labels, but for cases like zooming they would constantly be invalidated or blurry.

Edit: It would help with zooming and panning, which are the most common user operations. This requires keeping cached images of the elements at various sizes.
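
As a rough sketch of the idea (not the actual renderer code), a per-element cache could key bitmaps by element id plus a discrete scale bucket, so nearby zoom levels reuse a cached size instead of redrawing. `drawElement` and the element shape here are hypothetical:

```js
// Hypothetical sketch: one offscreen bitmap per element per scale bucket.
const elementTextureCache = new Map(); // key: `${id}@${bucket}` -> canvas

// Quantise zoom into power-of-two buckets so nearby zooms share a bitmap.
function scaleBucket(zoom) {
  return Math.pow(2, Math.round(Math.log2(zoom)));
}

function getCachedTexture(ele, zoom, drawElement) {
  const bucket = scaleBucket(zoom);
  const key = `${ele.id}@${bucket}`;
  let texture = elementTextureCache.get(key);

  if (!texture) { // miss: render the element once at this size
    texture = document.createElement('canvas');
    texture.width = Math.ceil(ele.width * bucket);
    texture.height = Math.ceil(ele.height * bucket);
    drawElement(texture.getContext('2d'), ele, bucket);
    elementTextureCache.set(key, texture);
  }

  return texture; // hit: blit with ctx.drawImage(texture, x, y, w, h)
}
```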

maxkfranz commented 9 years ago

Use a debounce on zoom events to invalidate the label caches. That way, zooms are fast and the text is made clear quickly afterwards.

Edit: Zooming still invalidates the cache, but only partially and only sometimes, because different sizes are cached. Changing the style or data invalidates the entire cache.
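
A minimal sketch of that debounce, assuming a `cy` instance; `redrawLabelsSharp()` is a hypothetical stand-in for invalidating and re-rendering the label textures:

```js
// Refresh label textures only after zooming pauses, so frames during the
// zoom stay fast (labels may be briefly blurry until the redraw fires).
function debounce(fn, delay) {
  let timer;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), delay);
  };
}

cy.on('zoom', debounce(() => {
  redrawLabelsSharp(); // hypothetical: invalidate caches, re-render crisp text
}, 150));
```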

maxkfranz commented 9 years ago

TODO: clean up with respect to label alignment (needs to be manual for now), etc.

maxkfranz commented 8 years ago

After the work done on #1194, a prototype estimate (using static random images rather than accurate element images) shows a performance increase of ~3-4x when using canvas to blit the images.

#1194 itself improved performance by ~1.5-1.6x.

Total improvement estimate with canvas: ~4.5x (the ~1.5x from #1194 compounded with the ~3x from blitting).

WebGL blitting of the images could be faster still; this jsperf estimates blitting at 7-8x faster in WebGL than in canvas, which would be a ~4x relative improvement to rendering over the canvas implementation, or a ~18x total speedup using WebGL.
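
The canvas blitting being measured boils down to pre-rendering an element once into an offscreen canvas and stamping that bitmap each frame instead of re-tracing vector paths; a simplified sketch with made-up sizes:

```js
// Pre-render once into an offscreen canvas...
const sprite = document.createElement('canvas');
sprite.width = 64;
sprite.height = 64;
const sctx = sprite.getContext('2d');
sctx.fillStyle = '#39f';
sctx.beginPath();
sctx.arc(32, 32, 30, 0, 2 * Math.PI); // a node-like disc
sctx.fill();

// ...then every frame is a cheap blit rather than a full vector redraw.
function drawFrame(ctx, positions) {
  for (const p of positions) {
    ctx.drawImage(sprite, p.x - 32, p.y - 32); // one drawImage per node
  }
}
```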

maxkfranz commented 8 years ago

Depends on #1226

maxkfranz commented 8 years ago

Requires removing the "hide labels during viewport operations" rendering hint, because it conflicts with this cache. Considering that the option was only ever a performance-related rendering hint and this cache adds far more performance, I don't think that's an issue.
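
For reference, the hint in question is one of the renderer's initialisation options; in 2.x it was set roughly like this (assuming the option name from the docs of that era):

```js
// The performance hint that conflicts with the cache: labels were simply
// skipped while the viewport was being zoomed or panned.
const cy = cytoscape({
  container: document.getElementById('cy'),
  hideLabelsOnViewport: true // rendering hint slated for removal
});
```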

maxkfranz commented 8 years ago

The simple implementation with images stored per element per scale is OK. It proves that drawing performance can be increased as expected with the cached images.

However, performance degrades heavily beyond a certain graph size: maintaining so many separate textures/images/caches seems to become expensive past some point. To mitigate this, a caching strategy with a smaller number of larger textures will be used.

Each texture is a wide strip of fixed width. The overall cache is divided into levels, and each level has a fixed strip height associated with it.

An element is placed in the strip with the smallest height that is greater than the element's height so as to minimise wasted area in each texture -- while keeping the algorithm for doing so straightforward and cheap. In the worst case, the wasted height is the difference between successive level heights.

Additional strips are allocated in each level to accommodate more elements. A strip is discarded when it is no longer efficiently using its space (i.e. many cache invalidations on it).

This strategy keeps the number of textures small while minimising the wasted space within them. It should address the performance drawback noticed, so that the performance gains are more consistent.
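
A minimal sketch of that placement rule, with hypothetical strip dimensions and level heights; the real implementation would also track invalidations per strip so that inefficient strips can be discarded:

```js
// Hypothetical geometry: strips are 1024px wide; level heights double.
const STRIP_WIDTH = 1024;
const LEVEL_HEIGHTS = [16, 32, 64, 128, 256];

// levels[i] holds the strips of height LEVEL_HEIGHTS[i].
const levels = LEVEL_HEIGHTS.map(height => ({ height, strips: [] }));

function makeStrip(height) {
  const canvas = document.createElement('canvas');
  canvas.width = STRIP_WIDTH;
  canvas.height = height;
  return { canvas, usedWidth: 0 };
}

// Place an element in the smallest level whose height fits it, appending a
// new strip when the current one is full. (Oversized elements would need
// special handling in a real implementation.)
function allocate(eleWidth, eleHeight) {
  const level = levels.find(l => l.height >= eleHeight);
  if (!level) throw new Error('element too tall for the cache');

  let strip = level.strips[level.strips.length - 1];
  if (!strip || strip.usedWidth + eleWidth > STRIP_WIDTH) {
    strip = makeStrip(level.height);
    level.strips.push(strip);
  }

  const x = strip.usedWidth;
  strip.usedWidth += eleWidth;
  return { strip, x }; // draw into strip.canvas at (x, 0)
}
```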

maxkfranz commented 8 years ago

The approach in the previous comment proved successful. It works well even when large numbers of elements are added to the graph. The results are in line with the previous calculation estimates at ~3.5x speedup over 2.6 (with canvas only).

Regardless of whether the shared or individual approach is taken, there are hiccups (framerate drops) when calculating the textures for all elements at once on zoom, pan, etc. Calculating the textures in smaller, queued batches may help; the next-best cached texture could be used in the interim.

Next, invalidating the cache for things like style changes needs to be added.

maxkfranz commented 8 years ago

Invalidation is working well. There are glitches at the border of an element's bounding box -- but that's probably just a clipping issue.

maxkfranz commented 8 years ago

The canvas-only implementation of this caching feature is working well, and its performance improvement is in line with my previous calculations. Some parameters may need to be tweaked, but the algorithm itself is working well.

Next is adding WebGL for blitting.

maxkfranz commented 8 years ago

Correction: Before WebGL, there should be experimentation with temporary upscaling to prevent hiccups on cache misses (e.g. while zooming in).

maxkfranz commented 8 years ago

Now there's a queuing mechanism for texture generation that prevents hiccups, with temporary fallbacks to previously cached textures. The queue has per-frame cutoffs on its performance impact to make sure the framerate stays smooth.
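
A sketch of such a budgeted queue, with hypothetical names; `renderTexture()` stands in for producing the crisp bitmap, and elements whose job is still queued would be drawn from their previous, lower-quality cache entry in the meantime:

```js
const textureQueue = []; // pending { ele, scale } jobs
const FRAME_BUDGET_MS = 4; // cutoff so queued work can't tank the framerate

function enqueueTexture(ele, scale) {
  textureQueue.push({ ele, scale });
}

// Each animation frame, render queued textures until the budget is spent;
// whatever remains waits for the next frame.
function processQueue(renderTexture) {
  const start = performance.now();
  while (textureQueue.length > 0 &&
         performance.now() - start < FRAME_BUDGET_MS) {
    const job = textureQueue.shift();
    renderTexture(job.ele, job.scale); // hypothetical crisp render
  }
  requestAnimationFrame(() => processQueue(renderTexture));
}
```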

There are also fixes so that edges don't reverse the performance benefits of caching.

Now WebGL can be addressed. From here on, changes to the algorithm will likely be limited to parameter tweaks.

maxkfranz commented 8 years ago

Edges present an edge case in the caching mechanism, because they are often large. This means using very large textures for many edges, which defeats the performance gains of the cache. In order to improve edges, a layered approach will be used on top of the per-element cache. This lets elements share a common texture area, with overlap allowed, thereby increasing texture space efficiency.

The simplest case of the layer cache is one layer. In this case, the layer is a texture of all elements. This texture can be reused on zoom and pan operations to reduce the number of draws.

This can be expanded to multiple layers, where the elements are batched together in one of N layers -- based on draw order. The multiple layer approach has benefits in terms of cache invalidation. However, it requires more draws of large images. Testing will show what number of layers is optimal. (WebGL may allow for more layers, thereby mitigating invalidation cases.)

The layers are segregated by scale level so that rendering is crisp.
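
In the one-layer case, the win comes from turning a whole-scene redraw into a single transformed blit; a simplified sketch, where `drawAllElements()` is a hypothetical stand-in for the full element draw pass:

```js
// Render the whole scene once into a layer texture at a reference scale...
const layer = document.createElement('canvas');
layer.width = 2048;
layer.height = 2048;
drawAllElements(layer.getContext('2d')); // hypothetical full redraw

// ...then each pan/zoom frame is one drawImage under the viewport transform.
function drawViewport(ctx, pan, zoom) {
  ctx.save();
  ctx.setTransform(zoom, 0, 0, zoom, pan.x, pan.y);
  ctx.drawImage(layer, 0, 0);
  ctx.restore();
}
```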

Invalidation and testing are still needed for layers. However, initial tests show a ~23x performance improvement for pan and zoom actions compared to 2.6.x, using canvas only so far. I suspect WebGL might make this even faster (potentially 23 x 8 = ~184x).

maxkfranz commented 8 years ago

Tweaking the cache parameters and some other changes have improved the performance of the layered cache considerably. Graphs of ~12,000 elements render at full screen refresh rate; on my machine, they render at effectively 1,000 fps.

This means that we can afford to use multiple layers, which will help a lot with expensive interactions like dragging elements and animations. It will also help with the on-screen fill rate for texture upgrades.

maxkfranz commented 8 years ago

To make the layers fast for large, spread-out graphs, it looks like we need to share a common texture for all layers. Because the layers have the same relative dimensions and are sized in powers of two, they can be organised like so:

[image: diagram of the layer textures packed into a single shared texture]

There isn't another meaningful option for the placement of the second-largest layer, and the remaining layers can follow in rows, filling each row until it runs out of horizontal space. This wastes some space, but it keeps the scheme simple -- and since each layer is half the size of the previous one, the total area used stays within 1.5x1 relative to the largest layer anyway.

This assumes one layer per level.
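
Since each level is exactly half the size of the previous one, the offsets can be computed directly; a sketch of one such packing (largest layer at the origin, the rest in rows within the half-width column beside it, assuming power-of-two dimensions):

```js
// Level i is (W >> i) x (H >> i); the shared texture is 1.5*W x H overall.
function packLayers(W, H, numLevels) {
  const offsets = [{ x: 0, y: 0 }]; // largest layer at the origin
  let x = W;
  let y = 0;
  let rowHeight = H >> 1;

  for (let i = 1; i < numLevels; i++) {
    const w = W >> i;
    if (x + w > W + (W >> 1)) { // row is out of horizontal space
      x = W;
      y += rowHeight;
      rowHeight = H >> i;
    }
    offsets.push({ x, y });
    x += w;
  }

  return offsets;
}
```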

maxkfranz commented 8 years ago

The shared texture for all layers is not efficient, because it's slow to do the drawImage() slicing. I suspect the hiccups on large graphs may be due to element texture refinement: it may cause too many texture switches on the GPU in the background.

Rendering only the largest layer and filling the lower levels from the top one may help. This would avoid misses in the lower levels of the element texture caches, meaning we would switch textures less often.

It would also avoid scaling expense: canvas tends to lose a lot of performance when scaling by a factor of less than 0.5 or more than 2. Each step down is exactly 2x, and because we never miss the middle layers, we avoid the expensive cases where a miss would require scaling by more than 2 or less than 0.5.
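
Filling the lower levels by successive halving keeps every drawImage() inside the fast scale range; a sketch, where the top-level canvas is assumed to already be rendered:

```js
// Build lower-resolution levels from the top one in exact 0.5x steps, so no
// single drawImage() ever scales by more than a factor of 2.
function buildLevels(topCanvas, numLevels) {
  const levels = [topCanvas];
  for (let i = 1; i < numLevels; i++) {
    const prev = levels[i - 1];
    const next = document.createElement('canvas');
    next.width = Math.max(1, prev.width >> 1);
    next.height = Math.max(1, prev.height >> 1);
    next.getContext('2d')
        .drawImage(prev, 0, 0, next.width, next.height); // exact 0.5x scale
    levels.push(next);
  }
  return levels;
}
```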

We may also just have to make the max layer area smaller for browsers like Firefox and Edge that seem to choke on larger values.

maxkfranz commented 8 years ago

TODO: use min-zoomed-font-size in all of the "hit" layer.

maxkfranz commented 8 years ago

The experiment with a shared texture for all layers failed. However, the experimental branch had some other changes, which were ported to the unstable branch today. These improve performance in cases where the old implementation would hiccup.

maxkfranz commented 8 years ago

TODO: box selection seems to distort / give bad bounding boxes for nodes.

maxkfranz commented 8 years ago

TODO: there seems to be an issue with element texture recycling; reproducible in the Cola demo.

maxkfranz commented 8 years ago

TODO: image export (e.g. cy.png()) has degraded quality.

maxkfranz commented 8 years ago

TODO: the hide-on-viewport options (labels, edges) are not working.