Open MaximilianTunk opened 1 month ago
@MaximilianTunk Great research! Since you have already dug this far, do you think you would be able to contribute a fix?
@dberenbaum Thank you :) I'm interested in contributing a fix, but I have to admit it'll be my first one ever to an open source GitHub project. I have a really nice colleague who is here to help any time if I need help or have questions, so fixing it shouldn't be any problem. If I have any questions or updates about possible solutions, I'll add a comment to this thread!
It looks like this issue is also present for `dvc params diff --all`.
@MaximilianTunk Are you working on this? If you haven't gotten to it, I'd be happy to take a shot at it.
Thinking out loud about the implementation, it seems desirable that `dvc.scm.iter_revs` returns a `Mapping` with one entry per commit hash even if passed multiple references/revisions that point to the same commit hash. This seems desirable because there are a few places in the codebase where iteration over the result of `Repo.brancher` is done (e.g., [1], [2]), and there's not much point in iterating over the same reference/revision multiple times, just to get the same result.
But because of that, it's kind of unclear what a good solution here is. My first thought would be to use some character that's not allowed in git references to join the revision names in `dvc.repo.brancher` (e.g., `:` instead of `,`), and then later split the dictionary keys on that character, if it's present in any of the keys. But that would have to be done in a few places in the codebase. Writing a wrapper around it seems clunky. Not sure.
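A rough sketch of that join-then-split idea, just to make it concrete. The names `group_revs` and `expand_keys` are made up for illustration and are not DVC functions; the real grouping lives in the brancher and may look quite different.

```python
from collections import defaultdict

SEP = ":"  # ':' is not allowed in git ref names, so it cannot collide with a rev


def group_revs(rev_sha_pairs):
    """Group revision names by commit hash and join each group with SEP.

    `rev_sha_pairs` is a list of (rev_name, commit_hash) tuples; this stands in
    for whatever the brancher resolves internally (an assumption).
    """
    by_sha = defaultdict(list)
    for rev, sha in rev_sha_pairs:
        by_sha[sha].append(rev)
    return {sha: SEP.join(revs) for sha, revs in by_sha.items()}


def expand_keys(results):
    """Split joined keys so every original revision name maps to its result."""
    expanded = {}
    for key, value in results.items():
        for rev in key.split(SEP):
            expanded[rev] = value
    return expanded


grouped = group_revs([("main", "abc123"), ("HEAD", "abc123"), ("dev", "def456")])
# Pretend each joined name produced one metrics result.
results = {name: {"acc": 0.9} for name in grouped.values()}
expanded = expand_keys(results)
```

Each commit is still visited only once, but callers that look up `main` or `HEAD` individually would find an entry after `expand_keys`. The downside mentioned above remains: every consumer of the joined keys would need this extra step.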
Bug Report
Description
Hello, we found that `dvc metrics diff --all` outputs nothing if `a_rev` and `b_rev` refer to the same git commit, whether they are exactly the same reference or different types of references (`HEAD` vs. `branch_name`, etc.).
Reproduce
dvc metrics diff --all $(git rev-parse --abbrev-ref HEAD) HEAD
Expected
output metrics-diff table with all values with diff = 0.0
Environment information
Output of `dvc doctor`:
Additional Information (if any):
Running the debugger, we noticed that metrics/diff.py:diff expects the results of metrics.show() to be keyed by the exact revs that were passed in. However, metrics/show.py:show uses the brancher to extract the revs to use, and the brancher groups revs with the same SHA and joins their names.
This means that when we call `dvc metrics diff --all main main`, the brancher groups `main` and `main` and returns `main,main`. Hence `repo.metrics.show()` outputs all metrics with the key `main,main`, and `repo.metrics.diff()` doesn't find results for `main`.
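To illustrate the mismatch, here is a toy model of the behavior described above. It is not DVC's code: `toy_show` and `toy_diff` are hypothetical stand-ins that only mimic the grouping by commit hash and the exact-name lookup.

```python
def toy_show(revs, sha_of, metrics_at):
    """Return one entry per commit, keyed by the comma-joined rev names."""
    grouped = {}
    for rev in revs:
        grouped.setdefault(sha_of[rev], []).append(rev)
    return {",".join(names): metrics_at[sha] for sha, names in grouped.items()}


def toy_diff(a_rev, b_rev, shown):
    """Look up each revision by its exact name and diff the shared metrics."""
    old = shown.get(a_rev, {})
    new = shown.get(b_rev, {})
    return {key: new[key] - old[key] for key in old.keys() & new.keys()}


# Two rev names pointing at the same commit get merged into one key.
sha_of = {"main": "abc123", "dev": "def456"}
metrics_at = {"abc123": {"acc": 0.9}, "def456": {"acc": 0.95}}

same_commit = toy_show(["main", "main"], sha_of, metrics_at)
empty_diff = toy_diff("main", "main", same_commit)

# Two rev names pointing at different commits keep their own keys.
two_commits = toy_show(["main", "dev"], sha_of, metrics_at)
real_diff = toy_diff("main", "dev", two_commits)
```

In the first case the only key is `main,main`, so the lookup for `main` finds nothing and the diff is empty. In the second case both keys exist and the diff is computed as usual.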