Quansight / lsst_dashboard

LSST Dashboard https://quansight.github.io/lsst_dashboard/
BSD 3-Clause "New" or "Revised" License

Optimized and improved skyplot and detail views #158

Closed philippjfr closed 4 years ago

philippjfr commented 4 years ago

This PR makes a variety of improvements to both the skyplot and the detail plot.

Instead of displaying datashaded+dynspread, rasterized, and decimated views of the points, this PR switches to a rasterized view with a fixed sampling density (500 pixels by default) and falls back to regular points when the data volume is below a threshold (10,000 points by default). This alone speeds up the plots significantly.
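As a rough illustration of the switching logic described above (not the PR's actual code), the decision can be sketched in plain Python. The names `plan_render`, `RenderPlan`, `max_points`, and `n_samples` are hypothetical; only the default values of 500 and 10,000 come from the description.

```python
from dataclasses import dataclass


@dataclass
class RenderPlan:
    mode: str          # "points" or "raster"
    x_sampling: float  # raster cell width in data units (0.0 for points)
    y_sampling: float  # raster cell height in data units (0.0 for points)


def plan_render(n_points, x_range, y_range, max_points=10_000, n_samples=500):
    """Pick a rendering mode for the given data volume (hypothetical helper).

    Below `max_points` the raw points are drawn; otherwise a raster with a
    fixed cell size of (axis range / n_samples) is used.
    """
    if n_points < max_points:
        return RenderPlan("points", 0.0, 0.0)
    x_sampling = (x_range[1] - x_range[0]) / n_samples
    y_sampling = (y_range[1] - y_range[0]) / n_samples
    return RenderPlan("raster", x_sampling, y_sampling)


small = plan_render(5_000, (0.0, 10.0), (0.0, 5.0))
large = plan_render(1_000_000, (0.0, 10.0), (0.0, 5.0))
print(small.mode, large.mode, large.x_sampling, large.y_sampling)
```

Because the cell size depends only on the full axis range, it stays the same however far the user zooms in, until the visible point count drops below the threshold.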

The second component of this PR is caching the ranges of the dataframe columns on the Dimension objects at initialization. This prevents HoloViews from computing these ranges over and over again, which it does in order to normalize ranges across plots and frames (something I will probably look at in HoloViews itself in the near future). The PR also uses da.percentile when computing the value ranges, so that outliers are dropped.
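A minimal sketch of the caching idea, under stated assumptions: the class name `RangeCache` and the 1st/99th percentile cutoffs are hypothetical (the cutoffs actually used are not given here), and a pure-Python nearest-rank percentile stands in for da.percentile so the example runs without dask.

```python
import math


def percentile(sorted_values, q):
    """Nearest-rank percentile (q in 0..100) of an already sorted list."""
    if not sorted_values:
        raise ValueError("cannot take a percentile of no values")
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


class RangeCache:
    """Compute each column's clipped range once, then serve it from memory."""

    def __init__(self, columns, lower=1.0, upper=99.0):
        self._ranges = {}
        self.computations = 0  # counts how often a range was actually computed
        for name, values in columns.items():
            ordered = sorted(values)
            self._ranges[name] = (
                percentile(ordered, lower),
                percentile(ordered, upper),
            )
            self.computations += 1

    def range(self, name):
        # No recomputation here: every lookup is a dictionary access.
        return self._ranges[name]


# One extreme outlier (10_000) would otherwise dominate the axis range.
cache = RangeCache({"psfMag": list(range(1, 101)) + [10_000]})
for _ in range(3):
    lo, hi = cache.range("psfMag")
print(lo, hi, cache.computations)
```

In HoloViews the cached values would presumably be stored as the `range` of each Dimension, so that the library uses them instead of scanning the data again.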

Along with the recent fix to Panel templates, which has been available since Panel 0.9.4, this should significantly speed up the dashboard.

Note: This PR also includes @brendancol's fix for changing to persist throughout.

philippjfr commented 4 years ago

@timothydmorton It would be great if you could test this. I'd be particularly interested in feedback on the xsampling and ysampling values, i.e. whether the rasterized plots are too coarse.

timothydmorton commented 4 years ago

Awesome, this sounds great! I will check this out first chance I get (likely not until later this evening).

timothydmorton commented 4 years ago

I am testing this on lsst-dev with all available tracts in /project/dharhas/DM-23243-KTK-Full.

  1. From when I click "load data," there is an initial (right away) ~2s of dask activity (mostly "from-delayed"; I'm guessing this is loading via ktk). Then there is about 45s of 100% CPU usage, after which dask starts again, taking about another minute before the first image shows up. Screenshot of dask graph during this minute:

    Screenshot 2020-04-07 11 24 16

  2. In skyplot view, when I resize the browser (Chrome) vertically (make it either shorter or taller), the skyplot window increases, out of frame, and keeps doing so whenever I make a resize action. There is no scrolling available for the window. When I refresh the browser, the skyplot is completely gone.

  3. A box zoom on the skyplot shows a quick flurry of dask activity, and then silence, but then no rerendering seems to happen, so I'm not able to test the changes that happen upon zoom-in. Seems to have same behavior regardless of chosen box zoom size.

    Screenshot 2020-04-07 11 33 19

  4. Switching then to detail view (after having box-zoomed), the skyplot shows up with the normal original axis ranges, but with only the region that was box-zoomed-to showing:

    Screenshot 2020-04-07 11 37 49

    Upon switching back to "skyplot view," we get the same view:

    Screenshot 2020-04-07 11 36 34

  5. The scatter plot in the "detail view" now looks very different, in that we can no longer see the locus of points at low psfMag and around 0 well, and we don't see the greyed-out non-selected points upon selection. e.g.:

    Screenshot 2020-04-07 11 42 53

  6. When zooming/panning the above scatter plot, there is dask activity and a re-render after a few seconds, but the dask graph looks very sparse:

    Screenshot 2020-04-07 11 43 46

    I wonder if this is doing something that can't execute on multiple threads.

@philippjfr @brendancol it would probably save time between iterations if you were able to test/debug this yourself on lsst-dev. There's clearly some difference between your local testing and the lsst system, perhaps in how dask works, or in some other packages, or something else that happens when scaling to larger datasets.

philippjfr commented 4 years ago

I'll try to set up a meeting with @dharhas and/or @brendancol tomorrow so they can walk me through running this on lsst-dev. I'm perplexed by most of the issues you describe, particularly 3, 4 and 5, as that's definitely not what I'm seeing locally.

timothydmorton commented 4 years ago

I upgraded to panel 0.9.4; are there any other required new package versions of anything?

philippjfr commented 4 years ago

HoloViews 1.13.2 and Panel 0.9.5 are what I would recommend.
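One way to check an environment against these recommendations is sketched below. It assumes the packages are installed under the distribution names `holoviews` and `panel`, treats the recommended versions as minimums (an assumption), and uses a deliberately simple, hypothetical version parser; `packaging.version` would be the more robust choice.

```python
from importlib import metadata
from itertools import takewhile

# Versions recommended in this thread, treated here as minimums.
RECOMMENDED = {"holoviews": "1.13.2", "panel": "0.9.5"}


def version_tuple(version):
    """Turn '0.9.5' (or '1.13.2rc1') into a comparable tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_versions(recommended=RECOMMENDED, lookup=metadata.version):
    """Map each package to (installed version or None, meets recommendation)."""
    report = {}
    for package, wanted in recommended.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            report[package] = (None, False)
            continue
        ok = version_tuple(installed) >= version_tuple(wanted)
        report[package] = (installed, ok)
    return report


for package, (installed, ok) in check_versions().items():
    print(f"{package}: installed={installed} ok={ok}")
```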

timothydmorton commented 4 years ago

OK, will try with those

timothydmorton commented 4 years ago

OK, now I see (not sure if I tested this completely or not before) if I box-zoom very close, I do indeed get points. But it looks like outside of that point, the raster resolution still doesn't seem to be increasing; i.e., it goes from very blocky like this straight to points. Is this what you see, too?

Screenshot 2020-04-07 12 10 56

timothydmorton commented 4 years ago

Also, still seeing this when switching to detail view after having zoomed

Screenshot 2020-04-07 12 16 23

philippjfr commented 4 years ago

> But it looks like outside of that point, the raster resolution still doesn't seem to be increasing; i.e., it goes from very blocky like this straight to points. Is this what you see, too?

This is what I was asking you about above regarding the sampling values. 500 is indeed pretty low, but it was hard for me to judge because the testing dataset was already very sparse at that level. Sounds like I should bump that by at least a factor of 10.

To explain more thoroughly: right now we take the range along an axis and divide it by some number N (500 by default). The result is the maximum sampling density and therefore the size of each pixel. It does seem like we may have to compute the sampling density dynamically based on the total number of points.
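The arithmetic behind the blockiness can be worked through with a small sketch. The function names and the example ranges are hypothetical; only N = 500 and the proposed tenfold increase come from the discussion.

```python
def pixel_size(full_range, n_samples=500):
    """Size of one raster cell: the full axis range divided by N."""
    lo, hi = full_range
    return (hi - lo) / n_samples


def cells_across(viewport, full_range, n_samples=500):
    """How many raster cells span the currently visible part of an axis."""
    lo, hi = viewport
    return (hi - lo) / pixel_size(full_range, n_samples)


full = (0.0, 100.0)    # hypothetical full extent of one axis
zoomed = (10.0, 11.0)  # box zoom onto 1% of that extent

# With N = 500 only about 5 cells remain across the zoomed view, which is
# why it looks blocky; with N = 5000 there are about 50.
print(cells_across(zoomed, full, n_samples=500))
print(cells_across(zoomed, full, n_samples=5000))
```

Since the cell size is fixed by the full range, zooming never refines the raster; the view only changes character once the point count in the viewport falls below the points threshold.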

philippjfr commented 4 years ago

> Also, still seeing this when switching to detail view after having zoomed

This is definitely a bug, but let me just clarify the expected behavior. When you switch between the skyplot and detail views do you want the range to persist?

dharhas commented 4 years ago

Expanding on what @timothydmorton said:

If I zoom in skyplot view and switch to detail, it looks like the data range is being pulled from the zoomed-in portion of the skyplot, but the viewport (the actual x, y limits) is based on the entire dataset.

If I now switch back to the skyplot view, the visible data is still from the zoomed in view but the viewport is now based on the entire dataset.

My 2 cents on options

a) Let's not link the ranges between the skyplot and detail view at all.

b) If we link them, then we need to preserve the viewports on both plots, if that makes sense.
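The two options can be sketched with a small state holder. This is a hypothetical model, not the dashboard's code: the class `ViewState` and its methods are invented for illustration. The point is that each view's data window and axis limits come from the same stored viewport, which is exactly what the bug described above violates.

```python
class ViewState:
    """Track one viewport per view, optionally linked across views."""

    def __init__(self, full_extent, linked=True):
        self.linked = linked
        self._viewports = {"skyplot": full_extent, "detail": full_extent}

    def zoom(self, view, viewport):
        # Option (b): a zoom in one view updates every view's viewport.
        # Option (a): only the view that was zoomed changes.
        targets = list(self._viewports) if self.linked else [view]
        for name in targets:
            self._viewports[name] = viewport

    def render_args(self, view):
        # Data window and axis limits are always taken from the same viewport.
        viewport = self._viewports[view]
        return {"data_window": viewport, "axis_limits": viewport}


full = ((0.0, 100.0), (0.0, 50.0))
zoomed = ((10.0, 11.0), (5.0, 6.0))

linked = ViewState(full, linked=True)
linked.zoom("skyplot", zoomed)
print(linked.render_args("detail"))

independent = ViewState(full, linked=False)
independent.zoom("skyplot", zoomed)
print(independent.render_args("detail"))
```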

timothydmorton commented 4 years ago

@philippjfr and I discussed this bug last night, and I did tell him that linking ranges would be good. Let's keep that as long as it doesn't complicate other things, and as long as the viewport issue can be fixed.