Closed dberenbaum closed 1 year ago
I'm also preparing an issue regarding branches 😋
Close to a duplicate of #456. I think we only need one of these issues.
I don't think we should be constrained by existing CLI flags here like --all-branches etc. VS Code might be able to provide way more flexibility to pick and choose specific refs from the git tree within the UI. I think this type of functionality could be broadly useful not just in this feature, but in other features like removing unwanted experiments or data.
Is there a plan to show more than 2 previous commits in the VS Code extension's experiments table anytime soon?
@daveraghav yes! just as a matter of getting feedback - how many commits would you prefer to see? do you need to see branches?
Thanks for the update @shcheklein. Ideally the more the better. Either using 'Load More Commits' or something similar as mentioned by @sroy3, or by making it configurable in extension settings. Also showing experiments from other branches would be great as multiple data scientists often work across parallel branches on the same project.
Thanks, yes. That's exactly why I'm thinking about allowing people to pick and choose in some way what they want to see (vs trying to present all commits and all branches - too expensive, and too much info). I'm thinking about what would be the best way to find/add commits to show. We can introduce a control like this:
By default the ribbon would always have the "current branch" + 2 commits in it.
I guess the idea is to expand in the experiments table, and then those will be automatically shown as options in the plots view?
My take regarding the table:
- Dropdown to pick a branch, tag, commit sha
I think picking from branches would be enough to start and less overwhelming for users.
#2392 displays the previous two commits. The next step could be to have a "show more commits" control at the bottom of the table. Each click would add two more commits (we can easily fine-tune that number) until there are no more to show.
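To make the "show more commits" idea concrete, here is a minimal Python sketch of the paging logic. It is not the extension's code: the helper names, the default of two previous commits, and the step of two are assumptions taken from this discussion.

```python
# Hypothetical paging helpers for "Show More Commits" / "Show Less Commits".
# INITIAL_COMMITS and STEP are assumptions from the thread, not real settings.
from typing import List, Sequence

INITIAL_COMMITS = 2  # previous commits shown by default
STEP = 2             # commits added or removed per click


def show_more(visible: int, total: int, step: int = STEP) -> int:
    """Return the new visible count, capped at the number of commits available."""
    return min(visible + step, total)


def show_less(visible: int, step: int = STEP, minimum: int = INITIAL_COMMITS) -> int:
    """Return the new visible count, never dropping below the default."""
    return max(visible - step, minimum)


def can_show_more(visible: int, total: int) -> bool:
    """The 'show more' control is only useful while commits remain hidden."""
    return visible < total


def visible_commits(commits: Sequence[str], visible: int) -> List[str]:
    """Slice the (newest-first) commit list down to what the table should render."""
    return list(commits[:visible])
```

With five previous commits available, two clicks on "show more" would go from 2 to 4 to 5 visible commits, after which the control could be disabled.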
Agree with @sroy3 that "show more" is simpler and more intuitive over having a configurable number of commits per branch/item.
+1 to this. I'm confused about why I now only see 2 previous commits and the experiments on them, without any option to show the others.
@lainisourgod we are working on this! are you interested in seeing other branches? any thoughts about the interface for this?
I'm new to the whole experiments thing, so my workflow may not quite match the philosophy of DVC.
However, what I'd really want is to be able to choose from all experiments and to filter to see only some of them. I have a bunch of metrics on every run, and sometimes want to show a set of experiments in one table (e.g. baseline, added feature 1, added feature 2, added feature 1 and 2) so I can easily compare them.
In this case, experiments from other branches can be useful too, because I can add different features in different branches.
Also I'd love to have a somewhat powerful search menu covering commit messages, hashes and exp names.
Also, a little aside from that, it'd be nice to reproduce dvc exp diff functionality in the Experiments view. E.g. set some experiment as a baseline, and for the other experiments show diffs in metrics and params instead of absolute values.
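As a rough illustration of the baseline idea, the sketch below computes per-metric differences against a chosen baseline experiment. It is only a sketch under assumptions: the function name and the flat metric dictionaries are hypothetical, and this is not how dvc exp diff or the extension works internally.

```python
# Hypothetical helper: show metrics relative to a baseline instead of absolute values.
from typing import Dict, Optional

Metrics = Dict[str, float]


def diff_against_baseline(baseline: Metrics, other: Metrics) -> Dict[str, Optional[float]]:
    """Return other - baseline per metric; None when the baseline lacks that metric."""
    return {
        name: (value - baseline[name] if name in baseline else None)
        for name, value in other.items()
    }


# Example values are made up for illustration only.
baseline = {"acc": 0.5, "loss": 1.0}
feature_1 = {"acc": 0.75, "loss": 0.5, "f1": 0.25}
deltas = diff_against_baseline(baseline, feature_1)
```

A table cell could then render the delta (for example with a sign and colour) and fall back to the absolute value when the delta is None.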
DVC has an option to show all branches: dvc exp show --all-branches or dvc exp show -a. When using this flag, it is impossible to set the number of commits to show at the same time. Here is the display of dvc exp show -n 3 -a:
The simplest way to include the branches in the experiments table (first step) would be to have a toggle to switch between regular commits view vs. branches view. Here is a quick test of how this could look:
The "Switch to Branch View" button will remove the "Previous Commits" rows and run dvc exp show -a underneath. "Show More Commits" and "Show Less Commits" would be disabled and "Switch to Branch View" would be replaced by "Switch to Commits View". We can probably experiment a little with the style and placement (the previous image was created as a visual aid more than a mockup). Would that work for everyone as a first step?
DVC has an option to show all branches dvc exp show --all-branches or dvc exp show -a. When using this flag, it is impossible to set the number of commits to show at the same time. Here is the display of dvc exp show -n 3 -a:
yes, we need some DVC support for this. Most likely an option to pass arbitrary commits (revs) + -n to get their history. That is general enough to support all the cases I think. Let's create a ticket on the DVC side and/or contribute if needed. cc @dberenbaum .
Would that work for everyone as a first step?
Could you clarify which branch it would be switching to? how can we show multiple branches? Does it mean that we run two dvc exp show commands?
I think eventually we want this to be a single table with multiple sections per branch / tag, etc. That's needed to be able to compare things and plot things together from different branches.
yes, we need some DVC support for this. Most likely an option to pass arbitrary commits (revs) + -n to get their history. That is general enough to support all the cases I think. Let's create a ticket on the DVC side and/or contribute if needed. cc @dberenbaum .
As discussed, I hope this is not a large effort, but I wonder, especially with the results soon being cached, if it's better to make separate calls to exp show as a first step. WDYT?
Edit: It's not just about saving dev time on DVC but also about waiting to figure out what UI we really want.
I think it's clear that we want to see one table with multiple commits that belong to different branches pretty much (I can't come up with an alternative to this, but I would be happy to hear thoughts). No matter if we show some history or not, etc - I think it doesn't change much - we'll need some support from DVC, and most likely it will be in some form similar to what I described. I don't see a large risk in implementing that since I hope it's not a large effort and we can change params a bit if needed as we go.
if it's better to make separate calls to exp show as a first step. WDYT?
It's definitely an extra effort that we'll need to replace later for sure (every additional command is more fragility + more overhead + lock contentions in some cases, etc, etc). How much? I don't know @sroy3 @mattseddon any thoughts?
I guess what I'm wondering is whether it's easier/better to have one giant JSON with everything vs. independent JSON for different branches/commits that could be partially updated, shown in separate tables, etc.
Good question, Dave. From my past observations/experience (and that was my assumption) - it's always better to minimize the number of commands that we run (less fragile, faster, etc) + I think implementing these flexible, partial updates, or UI that supports that can also be on a different level of complexity. Again, would love to hear other opinions / thoughts on this.
Would that work for everyone as a first step?
Could you clarify which branch it would be switching to? how can we show multiple branches? Does it mean that we run two dvc exp show commands?
It's actually all branches. Just like the DVC output I've posted.
I see. I think it can become very expensive and noisy tbh. I think we need to have a way to pick one or more branches and show them. Implementation-wise, if it's not a huge effort we can go with multiple commands and eventually migrate to a single one.
There are currently no commands to show only one branch (unless I've misread the docs). I don't think it'd be that expensive. There aren't usually that many branches in a project. There could potentially be an almost infinite number of commits, but the number of branches should stay relatively low.
Okay, it can simplify the initial step a bit (no need for a mechanism to pick a branch). We still want to show multiple commits per branch + almost as a next step we'll need a way to hide branches that are not relevant (I have repos with 10+ branches, in ML people use branches in some cases extensively for experimentation, etc). So, I'm not sure we are saving anything - it's just a bit different approach, but we'll end up in the same place - ability to pick which branches / tags to show + some history in some cases + show it within a single table.
There are currently no commands to show only one branch (unless I've misread the docs).
The --rev option can be used to show only experiments derived from one branch. For example, dvc exp show --rev try-large-dataset in https://github.com/iterative/example-get-started
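For the "one command per branch" first step discussed above, a minimal Python sketch could look like the following. It is an illustration under assumptions rather than the extension's implementation: the helper names are hypothetical, it assumes --rev, -n and --json can be combined in the installed DVC version, and it treats the JSON payload as opaque because its shape varies between versions.

```python
# Hypothetical sketch: one `dvc exp show` call per selected branch, results keyed by branch.
import json
import subprocess
from typing import Any, Callable, Dict, List, Sequence


def build_exp_show_args(rev: str, num_commits: int) -> List[str]:
    """Build the argv for a single branch; flag combination is an assumption."""
    return ["dvc", "exp", "show", "--rev", rev, "-n", str(num_commits), "--json"]


def run_dvc(args: List[str]) -> str:
    """Run the command and return stdout; raises CalledProcessError on failure."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def collect_by_branch(
    branches: Sequence[str],
    num_commits: int,
    runner: Callable[[List[str]], str] = run_dvc,
) -> Dict[str, Any]:
    """Return {branch: parsed JSON}, so each branch can be refreshed independently."""
    return {
        branch: json.loads(runner(build_exp_show_args(branch, num_commits)))
        for branch in branches
    }
```

Passing the runner as a parameter keeps the sketch testable without DVC installed, and keying results by branch leaves room for either partial updates or a later switch to a single combined call.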
@sroy3 Just a note to please not worry too much about what dvc exp show does. If you have other ideas for UI/UX, I'm sure we can either build it into dvc exp show or find some creative solution to show the right rows.
Especially with https://github.com/iterative/dvc/pull/9170 getting integrated soon.
Branches view was released with 0.7.1. Here is what it looks like: https://www.loom.com/share/50307b14f4c04f429ee5570f1dcc93ee
Currently working on adding individual branches with previous commits.
With https://github.com/iterative/dvc/issues/9390 being closed, we'll be able to use one dvc call instead of one per branch.
Since that PR touches a lot of the data, I'll wait until this gets merged before implementing the multiple branches in a single call (https://github.com/iterative/dvc/pull/9391#issuecomment-1531799956).
I'll also do any follow-ups linked to user experience first, as the new call is just a convenience for us and not visible to the user.
Some follow-ups that I currently have on my list are:
There will probably be more after the review, or feel free to add more.
In the Plots and Experiments views, is it possible to compare the workspace or experiments to past commits or branches? Right now, it seems limited to experiments based on the current commit. For example, in plots it would be nice to be able to enter any git ref here: