Open shcheklein opened 1 year ago
did we change the logic behind the naming recently
Name duplications (the pair of name, baseline_sha
is not allowed to collide), started to be allowed when we introduced the human-readable names.
dvc exp show -A doesn't show all commits in this case (can be orphaned commits and experiments in this case - I'm doing a lot of rebases, etc).
There is not easy way to disambiguate those from the listing (I don't know how to remove a specific one under a specific commit).|
TBH, I was not aware that orphan experiments were displayed in exp list
Apart from this issue (which I think makes sense to show the sha), we are deprecating exp gc
so maybe we need something like exp remove --orphan
.
Name duplications (the pair of
name, baseline_sha
is not allowed to collide), started to be allowed when we introduced the human-readable names.
I think exp names should be unique (see #9361). It solves a lot of these problems and I don't see much downside. Is it important to be able to give two experiments with different baselines the same name?
Apart from this issue (which I think makes sense to show the sha), we are deprecating
exp gc
so maybe we need something likeexp remove --orphan
.
It could also be part of exp clean
. I would rather avoid having to explain the whole concept of orphaned refs if we can.
We should always everywhere show a Git hash for an experiment.
From a quick look at the code, it appears to be simple for the local repo, but not sure how easy it is to do for remote repos.
It's not straightforward to determine what experiments are "orphaned". By definition, exp commits are never orphaned since there is at least one git ref pointing to the commit (i.e. our exp ref). We could walk backwards from every git tag and branch head to get a list of commits and then clean exps that derive from other commits, but there could be other non-branch/tag refs pointing to those commits as well.
In particular, with remote git repos there could be things like github pull request or deleted branch refs pointing to those commits, in which case it is not clear whether or not we should be deleting exps that derive from those commits.
@pmrowla could we do the equivalent to exp gc with all options enabled? That would be quite conservative, right?
Yes we can do that, but gc --all-commits
has the same problem I was describing.
DVC's --all-commits
is not actually all non-orphaned commits in the git repo. It's just commits that are reachable from git branch heads and tags (which again does not include commits that may be reachable from things like github PR's or backups of deleted github branches). --all-commits
is good enough for the majority of users in the local use case, but may not be what we want when we get into remote repos and exps that have been shared via exp push
If we think this gives enough reason to keep dvc exp gc
, I'm fine to reverse and keep it.
I hit this issue with duplicated experiments (don't know yet why, any ideas? did we change the logic behind the naming recently @daavoo @dberenbaum ?):
dvc exp show -A
doesn't show all commits in this case (can be orphaned commits and experiments in this case - I'm doing a lot of rebases, etc).There is not easy way to disambiguate those from the listing (I don't know how to remove a specific one under a specific commit).
We should always everywhere show a Git hash for an experiment. Name should serve as tag, or as a human readable name, etc. Collisions should be fine.