Optimize rendering performance

ComBatVision commented 2 years ago

The Kotlin version of WorldWind runs noticeably slower than the original Android and JS code bases. This is especially visible in the JS examples because of the large Kotlin-to-JS translation overhead: type checking, a Java-like HashMap implementation, equals and hashCode processing, etc.

One general reason why the Kotlin version feels slower is the on-the-fly re-projection of Mercator image tiles. The default examples use EPSG:3857 tile sources from popular 2D map tile services such as OpenStreetMap, Google Maps, etc. and re-project them to EPSG:4326, which the 3D globe requires. Re-projection is a per-pixel transformation of a 256x256 pixel image and consumes a lot of CPU, especially in JS. Android caches re-projected tiles in a GeoPackage and reads them back already re-projected, but JS caches the original tiles in the browser cache and re-projects them on every load. This problem can be avoided by using WMS EPSG:4326 tile sources.
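To make the cost concrete, here is a minimal sketch of such a re-projection. It is not the engine's actual code: `mercatorY`, `sourceRows` and `reproject` are hypothetical names, pixels are assumed to be packed as one Int each, and nearest-neighbour sampling is assumed. Longitude is linear in both projections, so in this sketch only rows have to be remapped.

```kotlin
import kotlin.math.PI
import kotlin.math.ln
import kotlin.math.tan

// Mercator Y (unitless) for a latitude given in degrees.
fun mercatorY(latDeg: Double): Double = ln(tan(PI / 4 + latDeg * PI / 180 / 2))

// For every output row (evenly spaced in latitude) find the source row
// (evenly spaced in Mercator Y) of a tile spanning [minLatDeg, maxLatDeg].
fun sourceRows(height: Int, minLatDeg: Double, maxLatDeg: Double): IntArray {
    val yTop = mercatorY(maxLatDeg)
    val yBottom = mercatorY(minLatDeg)
    return IntArray(height) { row ->
        // Latitude at the centre of the output row, counted from the top edge.
        val lat = maxLatDeg - (row + 0.5) / height * (maxLatDeg - minLatDeg)
        val t = (yTop - mercatorY(lat)) / (yTop - yBottom)
        (t * height).toInt().coerceIn(0, height - 1)
    }
}

// Copies whole rows from the Mercator image into an equirectangular image.
fun reproject(src: IntArray, width: Int, height: Int, minLatDeg: Double, maxLatDeg: Double): IntArray {
    val rows = sourceRows(height, minLatDeg, maxLatDeg)
    val dst = IntArray(width * height)
    for (row in 0 until height) {
        src.copyInto(dst, row * width, rows[row] * width, (rows[row] + 1) * width)
    }
    return dst
}
```

Even in this simplified form every tile load touches all 65 536 pixels, which is why caching the re-projected result (as Android does) or requesting EPSG:4326 tiles directly avoids the work.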

The next most noticeable hotspot in profiling is AbstractTiledElevationCoverage.scanHeightLimits(), which is used by Tile.getExtent(). It recalculates the tile's bounding volume (min and max height) every time the elevation coverage changes its timestamp, i.e. whenever a new elevation tile is downloaded. For some reason it is a very CPU-intensive operation, and compared to the JS code base it executes far too many times. Maybe we need to recalculate tile heights less often somehow. We need help investigating this issue.

And the weirdest thing is the Kotlin JS overhead caused by the Java-like HashMap implementation based on equals and hashCode. There are too many calls to getStringHashCode, Long.compareTo and Angle.compareTo. Most of these calls appear because the core of the WorldWindKotlin engine is the RenderResourceCache hash map, which stores image textures, buffers, shader programs, etc. Accessing this cache requires calculating the hash code of image tile URLs and other key objects in order to get or put resources. In JS profiling this operation appears among the top CPU consumers.
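One possible mitigation, shown here only as a sketch, is to wrap string keys in an object that computes the hash once. `UrlKey` is a hypothetical class, not part of WorldWindKotlin, and whether it actually helps on a given target would have to be confirmed by profiling.

```kotlin
// Hypothetical cache key: the URL hash is computed once, at construction,
// instead of on every get/put against the resource cache.
class UrlKey(val url: String) {
    private val hash: Int = url.hashCode()

    override fun hashCode(): Int = hash

    override fun equals(other: Any?): Boolean =
        this === other || (other is UrlKey && hash == other.hash && url == other.url)
}

// Example lookup helper over a plain HashMap keyed by UrlKey.
fun <V : Any> lookup(cache: Map<UrlKey, V>, key: UrlKey): V? = cache[key]
```

The key object would be created once per tile and reused for every frame, so repeated cache accesses only compare a stored Int and, on a match, the strings themselves.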

Another issue, related to the previous one, is the Angle class implementation. Most geometry classes are based on latitude and longitude values represented by the Angle class, a wrapper over a Double degrees value with angle-specific calculation logic. This approach was taken from the Java code base and is very convenient during development, but an Angle value is immutable, so each arithmetic operation on angles creates a new Angle instance, and each comparison of angles goes through Java-like equals and compareTo, which is very noticeable in JS profiling.
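One direction that could be explored (the thread does not say it was adopted) is a Kotlin inline value class, which keeps the typed wrapper in source code while letting the compiler use the raw Double where it can. `Degrees` below is a hypothetical stand-in, not the real Angle class.

```kotlin
import kotlin.jvm.JvmInline
import kotlin.math.PI

// Hypothetical stand-in for Angle: a typed wrapper over degrees.
// As a value class it can be represented by the underlying Double
// wherever it is not used as a generic or nullable type.
@JvmInline
value class Degrees(val value: Double) {
    operator fun plus(other: Degrees) = Degrees(value + other.value)
    operator fun minus(other: Degrees) = Degrees(value - other.value)
    operator fun compareTo(other: Degrees): Int = value.compareTo(other.value)
    val radians: Double get() = value * PI / 180.0
}
```

Values stored in generic collections or used through an interface are still boxed, so the benefit depends on how the type is used in hot paths.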

One more assumption is that Surface Path rendering uses heavy logic for projection onto terrain, because it loops over all terrain tiles. This is why the PathPolygonsLablesActivity example performs so poorly.

Dear community, we need your help with the performance investigation. Feel free to give us advice.

ComBatVision commented 1 year ago

One of the main performance issues is too many tiles in view when looking toward the horizon. We need to speed up tile subdivision when looking at the horizon and make the frustum far plane depend on camera altitude, because we do not need to render out to a distance of 160 km when the camera is 10 m above the terrain.
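As a rough illustration of an altitude-dependent far plane, the distance to the horizon on a smooth sphere is sqrt(h * (2R + h)). The sketch below assumes the WGS84 equatorial radius and ignores terrain relief, so a real far plane would need some margin on top; `horizonDistance` is a hypothetical helper name.

```kotlin
import kotlin.math.sqrt

// WGS84 equatorial radius in metres (assumption: a spherical globe is good enough here).
const val EARTH_RADIUS_M = 6_378_137.0

// Distance in metres from a camera at the given altitude to the horizon of a smooth sphere.
fun horizonDistance(altitudeM: Double, radiusM: Double = EARTH_RADIUS_M): Double =
    sqrt(altitudeM * (2 * radiusM + altitudeM))
```

For a camera 10 m above the surface this gives roughly 11.3 km, far less than 160 km.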

ComBatVision commented 1 year ago

1) We download all upper-level tile textures and load them into the resource cache, not only the required level. This performance cost was accepted to avoid flashing dark holes when you zoom out or rapidly swipe to the side and there is nothing to display until the required level is loaded.

2) Tiled image level of detail in the Kotlin version was made the same as in Google Maps: one tile pixel equals one dp. The original Android repo had detail 4x worse, which resulted in 4x fewer tiles in memory.

3) We probably need to increase the size of the elevation and image tile caches to avoid unnecessary tile generation and bounding box calculations.

4) When any new terrain tile is loaded, all bounding boxes are invalidated and the whole scene of tiles and surface objects recalculates min and max elevation from scratch.

5) Each image layer and each surface shape or image calculates its own bounding box and performs a frustum intersection instead of just checking whether it intersects the current terrain, which has already been intersected with the frustum in this frame. Surface objects cannot be outside the terrain.

6) Calculation of min and max elevations may incorrectly require more detailed elevation data than the rendered terrain, which causes more elevation tiles to be loaded than necessary.

7) Placemarks that are clamped or relative to ground and located on terrain outside the frustum require elevation model access to calculate their exact altitude, so that they do not fall under the ground or appear at the wrong position in space in augmented reality.

8) Each placemark transfers its own leader line buffer into graphics memory with a separate command, instead of all placemarks sharing one buffer with offsets that is transferred once per frame.

9) Tile subdivision depends on the field of view, and with a very narrow FoV an abnormal number of terrain and image tiles is generated.
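For item 8, the "one buffer with offsets" idea could look roughly like the sketch below. `PackedLines` and `packLines` are hypothetical names, and the actual upload to graphics memory is left out; the point is only that all leader lines end up in one array that can be transferred with a single command.

```kotlin
// All leader lines packed into one vertex array; offsets[i] is the index
// in `vertices` where the i-th line starts.
class PackedLines(val vertices: FloatArray, val offsets: IntArray)

// Concatenates per-placemark vertex arrays into one array plus start offsets.
fun packLines(lines: List<FloatArray>): PackedLines {
    val offsets = IntArray(lines.size)
    var total = 0
    for ((i, line) in lines.withIndex()) {
        offsets[i] = total
        total += line.size
    }
    val vertices = FloatArray(total)
    for ((i, line) in lines.withIndex()) {
        line.copyInto(vertices, offsets[i])
    }
    return PackedLines(vertices, offsets)
}
```

Each placemark would then draw its own range of the shared buffer by offset instead of issuing its own upload.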

ComBatVision commented 1 year ago

We now take into account the cosine of the angle between the viewing vector and the tile normal vector during subdivision, which significantly decreased the number of terrain and image tiles on the horizon. After applying this optimization, some of the other items are no longer relevant:
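The idea can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual subdivision test: `Vec3`, `shouldSubdivide`, `pixelSizeFactor` (metres covered by one screen pixel per metre of distance) and `detailControl` are all hypothetical names.

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

data class Vec3(val x: Double, val y: Double, val z: Double) {
    operator fun minus(o: Vec3) = Vec3(x - o.x, y - o.y, z - o.z)
    fun dot(o: Vec3) = x * o.x + y * o.y + z * o.z
    fun length() = sqrt(dot(this))
}

// Decides whether a tile needs a finer level of detail. The texel size is scaled
// by the cosine of the angle between the view vector and the tile normal, so tiles
// seen at a grazing angle (near the horizon) are subdivided less.
fun shouldSubdivide(
    eye: Vec3,
    tileCenter: Vec3,
    tileNormal: Vec3,
    texelSizeM: Double,
    pixelSizeFactor: Double,
    detailControl: Double = 1.0
): Boolean {
    val view = tileCenter - eye
    val distance = view.length()
    if (distance == 0.0) return true
    val cos = abs(view.dot(tileNormal)) / (distance * tileNormal.length())
    val projectedTexel = texelSizeM * cos
    val pixelSizeM = distance * pixelSizeFactor // metres per screen pixel at this distance
    return projectedTexel > pixelSizeM * detailControl
}
```

A tile viewed straight down keeps its full texel size in the test, while the same tile viewed almost edge-on contributes only a small fraction of it and stays at a coarse level.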

1) Downloading all upper-level tile textures and loading them into the resource cache may be kept, as it no longer affects performance now that the tile count has decreased from 400 to 50.

2) Tiled image level of detail in the Kotlin version may be kept the same as in Google Maps; it now produces 18-50 image tiles in the scene.

3) The size of the elevation and image tile caches was decreased rather than increased, as the number of tiles became significantly lower.

4) Invalidating only those tiles that intersect the newly loaded elevation tile, instead of invalidating the whole scene, may be extracted into a separate feature request, but it is not urgent now.

5) Intersection of surface objects with terrain tile sectors should be extracted into a separate feature request. Currently frustum culling is more efficient than checking each renderable against each terrain tile sector, which takes n*m checks.

6) Calculation of min and max elevations was optimized by decreasing the number of height limit samples from 64 to 32. This makes the algorithm that determines the height range use the same level of detail as the subsequent terrain rendering. This approach causes the very top points of some mountains to be mis-picked, but it gives a significant boost in performance, so we can live with it.

7) Determining the altitude of placemarks that are clamped or relative to ground and located on terrain outside the frustum was optimized by caching the last 50 000 elevation requests by coordinates. It significantly sped up placemark rendering.

8) Each placemark still transfers its own leader line buffer into graphics memory with a separate command, instead of all placemarks sharing one buffer with offsets that is transferred once per frame. This issue has a separate topic in the bug tracker and should be redesigned in the future.

9) Tile subdivision depends on the field of view, and with a very narrow FoV the tiles become very narrow, which produces an abnormal number of terrain and image tiles. But after taking the cosine to the tile normal into account, tile LoD drops off efficiently enough to make this issue obsolete. It no longer affects performance.
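The bounded elevation cache from item 7 could be sketched as below. The eviction policy is an assumption (least recently used; the thread only says the last 50 000 requests are cached), and `LruCache` is a hypothetical class written against Kotlin's insertion-ordered LinkedHashMap so that it does not rely on JVM-only APIs.

```kotlin
// Minimal LRU cache: the insertion-ordered map keeps the least recently
// used entry first, so eviction removes the first key.
class LruCache<K, V : Any>(private val capacity: Int) {
    init { require(capacity > 0) { "capacity must be positive" } }

    private val map = LinkedHashMap<K, V>()

    val size: Int get() = map.size

    fun get(key: K): V? {
        val value = map.remove(key) ?: return null
        map[key] = value // re-insert to mark as most recently used
        return value
    }

    fun put(key: K, value: V) {
        map.remove(key)
        if (map.size >= capacity) {
            map.remove(map.keys.iterator().next()) // evict least recently used
        }
        map[key] = value
    }

    // Returns the cached value or computes, stores and returns a new one.
    fun getOrPut(key: K, compute: () -> V): V =
        get(key) ?: compute().also { put(key, it) }
}
```

A placemark lookup would then go through something like `cache.getOrPut(latLonKey) { queryElevation() }`, where both names are placeholders for whatever key type and elevation query the engine uses.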

ComBatVision commented 1 year ago

Some nice-to-have performance enhancements were extracted into separate issues, but overall engine performance is acceptable now, so this general task will be closed. It may be reopened later.

WorldWindEarth / WorldWindKotlin

Optimize rendering performance #2