holoviz / hvplot

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
https://hvplot.holoviz.org
BSD 3-Clause "New" or "Revised" License
1.12k stars, 108 forks

Explorer: add cancel button to stop processing #1218

Open ahuang11 opened 10 months ago

ahuang11 commented 10 months ago

Sometimes I accidentally click on a column name with over 1000 unique groups for by, and then it takes forever unless I hit Cmd+C or restart the kernel.

maximlt commented 10 months ago

I think we've already discussed that in the past. IIRC it's pretty difficult to stop things from running. I think we discussed removing columns from by and groupby that have too many unique values.

ahuang11 commented 10 months ago

I was imagining this https://github.com/holoviz/panel/pull/5962

MarcSkovMadsen commented 10 months ago

+1. I am trying to build a custom data catalogue inspired by Intake. I would like my users to be able to explore a source using the Explorer, but if I select a datetime column in the by or groupby then suddenly everything blocks for a long time and sometimes even crashes the client or server.

It would be very helpful with one or more of the below.

I think it could be possible to support a limit that excludes columns with .nunique() > N from the by and groupby options.
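A minimal sketch of that kind of limit, in case it helps the discussion. The helper name low_cardinality_columns and the max_unique threshold are hypothetical and not part of hvPlot; the function only assumes a mapping from column name to its number of unique values.

```python
from typing import Mapping


def low_cardinality_columns(
    unique_counts: Mapping[str, int], max_unique: int = 100
) -> list[str]:
    """Keep only the columns with at most max_unique distinct values.

    Hypothetical helper: the name and the default threshold are
    illustrative assumptions, not hvPlot API.
    """
    return [col for col, n in unique_counts.items() if n <= max_unique]


# Example counts; "timestamp" stands in for a high-cardinality datetime column.
counts = {"species": 3, "island": 3, "timestamp": 8760}
options = low_cardinality_columns(counts, max_unique=100)
print(options)
```

With a pandas DataFrame, the counts could come from df.nunique().to_dict(), and the returned list could then be used as the options offered for by and groupby.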

maximlt commented 10 months ago

The explorer is pretty different from the chat components of Panel: it doesn't fetch data from the internet, which is indeed the kind of operation that can get very slow (e.g. a rate-limited API) and that you may want to stop. Ideally the explorer should not have slow and blocking operations, as you can't explore data when things get too slow :) !

Sure, we could try to add a cancel button to stop processing, but before jumping on that kind of engineering solution I'd like to look into potential UX solutions. Like:

philippjfr commented 10 months ago

Agree with @maximlt: the solution is not to let a user back out of some nonsensical selection that will, at best, cause lengthy processing in Python or, at worst, crash your browser, but rather to prevent users from making such selections in the first place. Indeed, in many cases there is no backing out. Often the actual Python portion of the processing finishes very quickly, but once Bokeh is asked to render the output it effectively freezes the browser, at which point it's too late. I too suggest we do some of the following:

MarcSkovMadsen commented 10 months ago

I agree that we should try to limit the risk of a user selecting something that blocks.

But I also believe that it would mean a lot if the plot could be generated in its own thread by default. As it is now, the explorer will not really work in a shared application because it will block the main thread over and over again for 0.1-5 seconds.

Maybe Panel should provide its own implementation of asyncify to run something async, or in its own thread, in one line of code.
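A rough sketch of what such a one-liner could look like, using only the standard library's asyncio.to_thread (Python 3.9+). The asyncify helper below is an assumption for illustration and not an existing Panel function, and render_plot is a hypothetical stand-in for the blocking plot generation.

```python
import asyncio
import time


def render_plot(n: int) -> int:
    """Hypothetical stand-in for a slow, blocking plot-generation call."""
    time.sleep(0.1)  # simulates blocking work; sleeping releases the GIL
    return n * 2


async def asyncify(func, *args, **kwargs):
    """Run a blocking function in a worker thread and await its result.

    Illustrative helper only, not part of Panel.
    """
    return await asyncio.to_thread(func, *args, **kwargs)


async def main() -> int:
    # The event loop stays free to serve other sessions while this runs.
    return await asyncify(render_plot, 21)


result = asyncio.run(main())
print(result)
```

Note that a worker thread only keeps the event loop responsive while the work releases the GIL (I/O, sleeping, some native code). CPU-bound pure-Python work would still compete with the main thread.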

philippjfr commented 10 months ago

Very doubtful that threading hvPlot would meaningfully unlock the GIL.