observablehq / feedback

Customer submitted bugs and feature requests

Naming a cell should not re-run the cell or re-fetch data? #336

Open tx0c opened 2 years ago

tx0c commented 2 years ago

Usually when exploring a dataset, we don't name every cell up front. Many cells start out unnamed, like db.sql..., just to get a piece of data. Later on, some cells turn out to be useful and need a name so they can feed plots or something else, so we have to change them to data = db.sql..., which re-runs the SQL and re-fetches the data.

This is a request for a way in the UI to name a cell without re-running it. I suggest a menu item like "name the cell" that pops up an input box for the user to type a name, then changes the cell content to name = ...original-content...

There are several reasons for this, for example:

  1. the naming may happen hours or days later, by which time the original database has changed, so a re-run returns a different result;
  2. or simply to save the time of a re-run, because some of my SQL queries already take minutes to run


Or, if you can detect that a change to a cell is naming only and keep it from re-running/re-fetching, that's also okay: for example, going from an unnamed (expression) to dataname = (same expression), or renaming to a different name.

CobusT commented 2 years ago

Interesting suggestion! Curious about your 1st case though... do you typically keep your browser open on the notebook for days without refreshing?

tx0c commented 2 years ago

Yes. The same notebook keeps growing bigger and bigger and might contain 100+ cells, and I found that if I close/re-open or refresh it, loading takes many minutes (or hours), or some cells fail (server overloaded or other reasons).

Can I make another feature request: that the initial opening of a notebook not request all data at once? Can the saved DB connector have RateLimitQueue behavior, like 5 concurrent queries, and/or a time window, like 10 queries/second? I also found that safe mode, https://observablehq.com/d/<note-id>/safe, is able to stop auto-loading, but it cannot execute any db query either; clicking a cell's run button does nothing. Can there be another, less safe mode that just doesn't auto-load? I would manually click each cell's run button as needed.

So the alternative is what I am currently doing. Laptops nowadays support graceful suspend/resume, and it's not uncommon for a laptop to go months or even years without a reboot. Similarly, the same browser tabs can be kept open to continue working for many days, weeks, or even months, depending on how long the data viz project needs.
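The "5 concurrent queries" part of the RateLimitQueue idea can be approximated in notebook code today. This is only a rough sketch: createQueryQueue and enqueue are hypothetical names, not part of Observable or its database connectors, and the time-window limit (10 queries/second) is not covered.

```javascript
// Hypothetical helper (not an Observable API): caps how many async
// tasks, such as database queries, run at the same time.
function createQueryQueue(maxConcurrent = 5) {
  let active = 0;      // tasks currently running
  const waiting = [];  // tasks not yet started, in FIFO order

  const next = () => {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)              // start the task
      .then(resolve, reject)   // forward its outcome to the caller
      .finally(() => {
        active--;
        next();                // a slot freed up, start the next task
      });
  };

  // enqueue(task) returns a promise for the task's eventual result.
  return function enqueue(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}
```

Assuming a notebook-wide `enqueue = createQueryQueue(5)`, each query cell would wrap its call in a function, for example `enqueue(() => db.sql...)`, so that at most five queries are in flight at once.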

CobusT commented 2 years ago

Sounds like a massive notebook. I would recommend controlling the loading of the various parts with functions that only query when you set flags. You can use a button to do this (or any other type of Input that works for your case). See this notebook for an example: https://observablehq.com/d/8aaaa1f253017d47
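For readers who cannot open the linked notebook, the flag-gated pattern might look roughly like the sketch below. It is not taken from that notebook: lazyQuery is a hypothetical helper, and in practice load() would be wired to a button or another input.

```javascript
// Hypothetical helper (not an Observable API): wraps a query so that
// nothing runs until load() is called, and the result is cached after that.
function lazyQuery(runQuery) {
  let result; // cached promise; stays undefined until the first load()
  return {
    // true once load() has been called at least once
    loaded: () => result !== undefined,
    // runs the query on first call, returns the cached promise afterwards
    load: () => {
      if (result === undefined) {
        result = Promise.resolve().then(runQuery);
      }
      return result;
    },
  };
}
```

Opening the notebook would then create only the wrappers, and a query would be sent the first time its load() is triggered.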

tx0c commented 2 years ago

That button is another alternative, but giving each of 100+ cells its own extra button and button wrapper means a lot of unnecessary duplicate code / boilerplate.

This is a request for such a feature to be built in.

If Ob builds this feature in, I believe it can manage the cell dependencies much better.

I suggest it could be done via some kind of query parameter: https://observablehq.com/d/8aaaa1f253017d47?stopautorun

This is especially useful and wanted in team work, when presenting to others. I also see that Ob supports BigQuery/Snowflake, which we want to test for analytics with Ob in the near future. Such platforms consume more resources and credits, and since we have had some BigQuery queries take minutes to run, we would definitely need finer control over which cells run and which don't.

Another idea is to give all unnamed cells some sort of auto-allocated temporary default name, so that renaming a cell does not re-run it.

CobusT commented 2 years ago

I think it is worth exploring not rerunning a cell when the cell name changes from unnamed to named.

tx0c commented 2 years ago

> changes from unnamed to named

Should that also include renaming? In general:

from:

expression

to:

name1 = expression

or renaming from name1 to

name2 = expression

Shouldn't these changes avoid triggering a re-run? You have a parser that can tell the right-hand side expression is unchanged, so could it skip the re-run?

This matters especially with BigQuery / Snowflake, where some queries can be very big and expensive.
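To make the request concrete, a naming-only check could look roughly like this. It is purely an illustration: splitCell and isRenameOnly are hypothetical names, and a real implementation would use the notebook's own parser rather than a regex, which only handles simple `name = expression` cells.

```javascript
// Hypothetical sketch: split a cell's source into an optional name and
// the expression body. Only a leading "identifier =" counts as a name;
// "==" comparisons and "=>" arrow functions are left alone.
function splitCell(source) {
  const m = /^\s*([A-Za-z_$][\w$]*)\s*=(?![=>])\s*([\s\S]*)$/.exec(source);
  return m
    ? { name: m[1], body: m[2].trim() }
    : { name: null, body: source.trim() };
}

// True when two versions of a cell differ at most in their name,
// meaning the right-hand side expression is unchanged.
function isRenameOnly(before, after) {
  return splitCell(before).body === splitCell(after).body;
}
```

With this check, going from `expression` to `name1 = expression`, or from `name1 = expression` to `name2 = expression`, counts as rename-only, so the cached value could be kept instead of re-running the query. Cells that reference the old name would presumably still need to be re-resolved.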

tx0c commented 2 years ago

> Sounds like a massive notebook

An update on this: these notebooks are mostly 30 to 60 cells, not massive; I don't see any notebook with 100+ cells.