Closed bprusinowski closed 1 year ago
@dabo505 can you put this issue on the fast track? we depend on this performance improvement for the next release of visualize...
Hi @bprusinowski. Could you share example queries which are actually impacted? With and without the dimensions being sorted. I'm surprised that it should have any effect on the performance of querying the view
Hi @tpluscode, the performance of the query is not affected, but we need to create the View before we fetch the observations – and by default, the dimensions are sorted, which takes around 0.5s in the query used for testing. So this change would not improve the performance of the query, but the performance of creating a View :)
By all means, do not hesitate to send a PR. Maybe with some additional examples because I do not yet fully understand the impact and where the actual time penalty occurs
Sure, this is the benchmark I've made to pin-point the issue.
With sorting enabled (see that --sort took 549 ms and in total it took 556 ms). Generally it fluctuates around 500-600 ms in the example used for testing (this chart https://int.visualize.admin.ch/en/v/hKkUyBpGpWPl?dataSource=Int).
With sorting disabled (see that in total it took 8 ms).
Could you grant me the appropriate rights to open a PR in the repository? Thanks!
Actually, I don't have the necessary permission either. Please fork and PR from there
I've opened a PR (#96) 👍
Released in v1.12
Hey!
Recently I've been working on performance improvements in Visualize.admin and I've been looking at the performance of the data fetching. I've noticed that in some cases it takes much longer to create the View (View.fromCube) than to fetch the data itself – see the Comments section of this PR. It turned out that it's "slow" because the dimensions are sorted by name when the View is created. I wanted to ask if it's necessary to sort them or if it's possible to add an option to enable / disable sorting?
I've made some tests and it seems that adding such a parameter had no impact on the generated observations query (additionally, in Visualize we sort the observations and dimensions anyway in the front-end part). I would be happy to open a PR if this change makes sense :)
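To make the proposal concrete, here is a rough sketch of what an opt-out could look like. It does not use the library's real code or API: `Dimension`, `ViewOptions`, `createView` and the `sortDimensions` option name are all hypothetical stand-ins, and the only behaviour taken from the issue is that dimensions are sorted by name by default.

```typescript
// Hypothetical stand-ins for illustration; not the library's real types.
interface Dimension {
  name: string;
}

interface ViewOptions {
  // Assumed option name. Defaults to true so existing callers keep sorted output.
  sortDimensions?: boolean;
}

interface View {
  dimensions: Dimension[];
}

// Hypothetical factory: sorts dimensions by name unless the caller opts out.
function createView(dimensions: Dimension[], options: ViewOptions = {}): View {
  const { sortDimensions = true } = options;
  // Copy first so the caller's array is never mutated.
  const result = [...dimensions];
  if (sortDimensions) {
    result.sort((a, b) => a.name.localeCompare(b.name));
  }
  return { dimensions: result };
}

const dims: Dimension[] = [{ name: "year" }, { name: "canton" }, { name: "amount" }];

// Default behaviour: dimensions come back sorted by name.
const sorted = createView(dims);

// Opt-out: dimensions keep their original order and the sort cost is skipped.
const unsorted = createView(dims, { sortDimensions: false });
```

Keeping the default at `true` would leave existing callers unaffected, while a consumer that sorts on its own side could skip the extra work.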