iterative / vscode-dvc

Machine learning experiment tracking and data versioning with DVC extension for VS Code
https://marketplace.visualstudio.com/items?itemName=Iterative.dvc
Apache License 2.0

Improve selecting experiments for plots visualization #5104

Open mnrozhkov opened 11 months ago

mnrozhkov commented 11 months ago

To show plots in VS Code, people need to select the experiments they want plotted. There are a few points we could improve to make the UX better. This issue discusses some ideas.

Currently, there are two issues that require attention.

  1. The experiment for the "workspace" is currently not being selected by default.
  2. The number of experiments that can be visualized in plots is restricted to 7.

What is the problem?

Step 1 - After the first experiment run, I select my new experiment [rindy-kobs] to show plots. It works fine!

image

Step 2 - I run a new experiment and see a second plot added from my new [based-tarp] experiment. So far it's great!

image

Step 3-5 - I run 5 more experiments... everything looks good

image

Step 6 - After more than 7 experiments, I can't see new plots automatically:

image

Starting from this point, the current behavior is not convenient anymore. When running a new experiment, I expect to see the corresponding plot - it's the first priority. Instead, for every new experiment, I need

In my mind there are 2 scenarios that might improve the UX:

Scenario 1 - Only manual selection

  1. Display plots only for the most recent experiment (workspace) by default.
  2. Every time a new run is initiated, the previous experiment is automatically deselected. (This prevents the automatic increment of selected experiments.)
  3. To ensure automatic plotting, the user must choose an experiment to keep it selected.
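A minimal sketch of Scenario 1, just to make the rule concrete. The names here (`WORKSPACE`, `selectionAfterNewRun`, the `pinned` list) are hypothetical and not taken from the extension's codebase; the assumption is that "choosing an experiment to keep it selected" can be modeled as a list of pinned ids.

```typescript
// Hypothetical sketch, not the extension's actual API.
const WORKSPACE = 'workspace'

// Returns the ids to plot once a new run is initiated:
// the workspace is always shown, and a previously selected
// experiment survives only if the user explicitly pinned it.
function selectionAfterNewRun(current: string[], pinned: string[]): string[] {
  const kept = current.filter(
    id => id !== WORKSPACE && pinned.indexOf(id) !== -1
  )
  return [WORKSPACE, ...kept]
}
```

For example, with `rindy-kobs` pinned and `based-tarp` not pinned, starting a new run would leave only the workspace and `rindy-kobs` plotted.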

Scenario 2 - Automatically select the most recent experiment after 7 runs

  1. Keep the current behavior until 7 experiments are selected.
  2. After that, for every new run initiated, the previous experiment is automatically deselected and the most recent experiment is automatically selected.
  3. Keep "manual" selection for subsequent experiments.

image
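A rough sketch of how Scenario 2 could behave, under one reading of the steps above: once the limit is reached, the most recently auto-selected experiment is swapped out for the new one. `PlotSelection`, `selectNewExperiment` and `MAX_SELECTED` are hypothetical names used only for illustration; only the limit of 7 comes from this issue.

```typescript
// Hypothetical sketch, not the extension's actual API.
const MAX_SELECTED = 7 // the limit mentioned in this issue

interface PlotSelection {
  selected: string[] // ids currently plotted
  lastAutoSelected?: string // id added by the previous automatic selection
}

function selectNewExperiment(
  state: PlotSelection,
  newId: string
): PlotSelection {
  // Below the limit: keep the current behavior and just add the new run.
  if (state.selected.length < MAX_SELECTED) {
    return { selected: [...state.selected, newId], lastAutoSelected: newId }
  }
  // At the limit: swap the previous auto-selected run for the new one.
  const previous = state.lastAutoSelected
  if (previous === undefined || state.selected.indexOf(previous) === -1) {
    // Nothing to swap out, so the selection stays fully manual.
    return state
  }
  return {
    selected: [...state.selected.filter(id => id !== previous), newId],
    lastAutoSelected: newId
  }
}
```

With this reading, the first six experiments stay plotted and the seventh slot always follows the latest run, unless the user changes the selection manually.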

WDYT? How do you feel about this? Looking for your ideas and feedback 🙌

mattseddon commented 11 months ago

@mnrozhkov how do other experiment-tracking tools handle this functionality?

The simplest thing to do would be to remove the auto-selection logic. No experiments would be selected/de-selected unless the user performed that action. All of the other options seem overcomplicated and I don't think we'll end up with something intuitive or valuable enough to warrant the required complexity in the codebase.

I have seen you mention "selecting the workspace by default" in both this issue and #5087. I think the case of no experiments being selected is handled pretty well by this screen:

image

SoyGema commented 11 months ago

> how do other experiment-tracking tools handle this functionality?

In WANDB you can select experiments ad infinitum; however, only the last 10 selected appear as visible. Once you go past your 10th experiment, the first one just disappears from the plot, though its visibility tick in the workspace stays checked. See how the en-hebrew green perf bar metric disappears at some point.

https://github.com/iterative/vscode-dvc/assets/24204714/fa59f032-9366-455d-9ce9-12b0f6b64f7d
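The behavior described above could be sketched roughly as follows. This is only an illustration of the idea (unlimited selection, capped visibility), not WANDB's implementation; `visibleRuns` and `MAX_VISIBLE` are made-up names, and the assumption is that selected ids are kept in the order they were selected.

```typescript
// Hypothetical sketch of "select any number, show only the last 10".
const MAX_VISIBLE = 10

// selectedInOrder: ids in the order they were selected, oldest first.
function visibleRuns(selectedInOrder: string[]): string[] {
  // A negative start index keeps only the most recent MAX_VISIBLE entries;
  // shorter lists are returned whole.
  return selectedInOrder.slice(-MAX_VISIBLE)
}
```

The selection itself is never trimmed here, so an experiment that dropped out of the plot stays ticked and comes back into view if newer ones are deselected.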

shcheklein commented 11 months ago

Thanks @SoyGema! And if you already have 10 selected and run a new one, would you expect it to be automatically picked up / selected?

SoyGema commented 11 months ago

Yes, I would expect that. And it does in WANDB. However, it takes some time until the visualization appears in the plot.

I just launched one now. What happens is that it is super verbose, first from the CLI. They give me:

  1. Experiment name: fast-aardk-46
  2. Link to project
  3. Link to run (I use this first just to check that the run worked correctly)

In your case, the equivalent might be seeing the CLI dvc exp run command launched and running.

Screenshot 2023-12-21 at 22.06.45

And then I go to the platform and the experiment appears as picked up / selected and at the TOP of the experiments column, even though it was just launched. If I wait a while, the bar tracking the metric appears.

Screenshot 2023-12-21 at 21.56.47

TANGENTIAL but important: please take into account how this evolves for a DS, namely grouping. Normally we have a hypothesis that involves a set of experiments that we design, launch, and later come back to analyze as a group (e.g. k-fold cross-validation). What I'm trying to say is that sequential experiment analysis is not the only mental model for analysis, and analyzing by group is becoming common in my scenario. queue might work for you here. The point is that the experiments appear as a group in the visualization.