mysticfall / pivot4j

Pivot4J provides a common API for OLAP servers which can be used to build an analytical service frontend with pivot style GUI.

Optimize table rendering #222

Closed (d-amelin closed 6 years ago)

d-amelin commented 6 years ago

I have a query returning 600000 cells. I want to export the result to Excel. The export took too long and consumed too much CPU.

  1. The profiler shows that the hotspot is `org.pivot4j.util.TreeNode.getWidth`: https://github.com/mysticfall/pivot4j/blob/master/pivot4j-core/src/main/java/org/pivot4j/util/TreeNode.java#L155 The method is called recursively many times, far more than the number of cells, because the width of the same TreeNode is calculated again and again for different parents. We need to cache the result of `TreeNode.getWidth` and reuse the calculated value for subsequent calls. `TreeNode.getMaxDescendantLevel` is not as hot as `getWidth`, but it is hot too, so we need to cache it as well. After doing that, export time and CPU consumption decreased dramatically, by more than 10 times.

We need to invalidate the cache when the children collection changes. This is easy because `getChildren` returns an unmodifiable list, so we control all modifications of the children.

  2. The memory profiler shows tons of garbage `UnmodifiableRandomAccessList` instances produced by `getChildren()`: https://github.com/mysticfall/pivot4j/blob/master/pivot4j-core/src/main/java/org/pivot4j/util/TreeNode.java#L215 They are collected immediately, but that volume of garbage triggers too many GCs, so the export runs slowly. IMHO, in this case we should allocate `private List<TreeNode<T>> unmodifiableChildren = Collections.unmodifiableList(children)` at creation time and return it from `getChildren`, rather than allocating a new wrapper on each call.
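
To make both points concrete, here is a rough sketch of what I mean. It is not the actual `org.pivot4j.util.TreeNode` code: the class `CachedTreeNode`, the field `cachedWidth` and the helper `invalidate()` are names made up for illustration, and I am assuming that the width of a node is the number of leaves under it (1 for a leaf). The same pattern should work for `getMaxDescendantLevel`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only; not the real org.pivot4j.util.TreeNode.
class CachedTreeNode<T> {

    private final T reference;

    private CachedTreeNode<T> parent;

    private final List<CachedTreeNode<T>> children = new ArrayList<>();

    // Read-only view created once. It is backed by 'children', so it
    // reflects later additions and removals without being reallocated.
    private final List<CachedTreeNode<T>> unmodifiableChildren =
            Collections.unmodifiableList(children);

    // -1 means "not computed yet" (a real width is always >= 1).
    private int cachedWidth = -1;

    CachedTreeNode(T reference) {
        this.reference = reference;
    }

    T getReference() {
        return reference;
    }

    // Returns the same wrapper instance on every call: no garbage per call.
    List<CachedTreeNode<T>> getChildren() {
        return unmodifiableChildren;
    }

    void addChild(CachedTreeNode<T> child) {
        child.parent = this;
        children.add(child);
        invalidate();
    }

    void removeChild(CachedTreeNode<T> child) {
        if (children.remove(child)) {
            child.parent = null;
            invalidate();
        }
    }

    // Drops the cached width of this node and of every ancestor,
    // because a change here alters the width of the whole path to the root.
    private void invalidate() {
        for (CachedTreeNode<T> node = this; node != null; node = node.parent) {
            node.cachedWidth = -1;
        }
    }

    // Assumed semantics: a leaf has width 1, otherwise the sum of child widths.
    int getWidth() {
        if (cachedWidth < 0) {
            int width = 0;
            if (children.isEmpty()) {
                width = 1;
            } else {
                for (CachedTreeNode<T> child : children) {
                    width += child.getWidth();
                }
            }
            cachedWidth = width;
        }
        return cachedWidth;
    }
}
```

Since all mutations go through `addChild` and `removeChild`, those are the only places where the cache has to be reset. Repeated `getWidth` calls on an unchanged subtree then cost a single field read.
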
mysticfall commented 6 years ago

Fixed with #221.