treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data
https://docs.lakefs.io
Apache License 2.0

Separate "fast forward" and empty merges #3274

Open arielshaqed opened 2 years ago

arielshaqed commented 2 years ago

This case is currently mishandled, even after #3270.

Say we develop the exact same 5 files on two branches, but with different orderings:

    gitGraph
       commit id:"1111"
       commit id:"2222"
       branch develop
       checkout develop
       commit id:"+a,b,c"
       commit id:"+d,e"
       checkout main
       commit id:"+a,c,e"
       commit id:"+b,d"
       merge develop

Here, branch develop first added a,b,c, and then d,e, while branch main first added a,c,e, and then b,d. Both branches end up with the exact same files, so the merge has no new metarange to write: either both branches ended up with the same range structure and therefore share a metarange ID, or their range structures differ and the merge can pick either one or create a third metarange ID.

But it is a new commit! This merge commit says that main and develop were the same, but its parents differ from those of either branch head (+d,e on develop, +b,d on main), so it is distinct from both. It's not even a fast-forward merge.
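To make the point concrete, here is a minimal sketch, assuming commit IDs are content addresses derived from the parents, the metarange ID and the message. The `Commit` type, its `ID` method and the `mr-...` identifiers are hypothetical and are not the actual lakeFS data model; the sketch only shows that reusing a metarange ID does not make the merge commit equal to either branch head.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Commit is a simplified, hypothetical commit record (not the lakeFS type).
type Commit struct {
	Parents     []string
	MetaRangeID string
	Message     string
}

// ID derives a content address from the parents, the metarange ID and the
// message. This hashing scheme is an assumption made for illustration.
func (c Commit) ID() string {
	h := sha256.New()
	h.Write([]byte(strings.Join(c.Parents, "\n")))
	h.Write([]byte{0})
	h.Write([]byte(c.MetaRangeID))
	h.Write([]byte{0})
	h.Write([]byte(c.Message))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Both branches start from commit "2222" and end with files a..e.
	// For simplicity, assume both heads share the metarange "mr-abcde".
	dev1 := Commit{Parents: []string{"2222"}, MetaRangeID: "mr-abc", Message: "+a,b,c"}
	devHead := Commit{Parents: []string{dev1.ID()}, MetaRangeID: "mr-abcde", Message: "+d,e"}
	main1 := Commit{Parents: []string{"2222"}, MetaRangeID: "mr-ace", Message: "+a,c,e"}
	mainHead := Commit{Parents: []string{main1.ID()}, MetaRangeID: "mr-abcde", Message: "+b,d"}

	// The merge reuses an existing metarange but has two parents.
	merge := Commit{
		Parents:     []string{mainHead.ID(), devHead.ID()},
		MetaRangeID: mainHead.MetaRangeID,
		Message:     "Merge develop into main",
	}

	fmt.Println(merge.MetaRangeID == mainHead.MetaRangeID) // true: no new metarange
	fmt.Println(merge.ID() != mainHead.ID())               // true: still a new commit
	fmt.Println(merge.ID() != devHead.ID())                // true: still a new commit
}
```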

Current code (correctly!) identifies that no new metarange ID is needed, but then (incorrectly!) fails with an error and prevents the new merge commit from being created.
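One way to separate these cases is sketched below, assuming the merge code already knows the merge base, both branch heads, and the metarange IDs before and after merging. `Classify`, `MergeKind` and all parameter names are hypothetical and do not correspond to existing lakeFS functions; the intent is only to show that "no new metarange" and "no new commit" are different conditions.

```go
package main

import "fmt"

// MergeKind is a hypothetical classification of merge outcomes.
type MergeKind int

const (
	NoOp         MergeKind = iota // source is already contained in the destination
	FastForward                   // destination has no commits of its own since the base
	EmptyMerge                    // histories diverged but the resulting contents are unchanged
	RegularMerge                  // histories diverged and a new metarange is needed
)

func (k MergeKind) String() string {
	switch k {
	case NoOp:
		return "no-op"
	case FastForward:
		return "fast-forward"
	case EmptyMerge:
		return "empty merge"
	default:
		return "regular merge"
	}
}

// Classify decides what kind of merge is being performed.
// baseID is the merge base commit, sourceID and destID are the branch heads,
// and the two metarange IDs describe the destination before and after merging.
func Classify(baseID, sourceID, destID, destMetaRange, mergedMetaRange string) MergeKind {
	switch {
	case baseID == sourceID:
		// Nothing to merge: every source commit is already in the destination.
		return NoOp
	case baseID == destID:
		// The destination could simply move to the source head.
		return FastForward
	case mergedMetaRange == destMetaRange:
		// Same contents, but a merge commit with two parents is still needed.
		return EmptyMerge
	default:
		return RegularMerge
	}
}

func main() {
	// The scenario from this issue: both heads diverged from "2222"
	// and ended up with the same metarange.
	kind := Classify("2222", "develop-head", "main-head", "mr-abcde", "mr-abcde")
	fmt.Println(kind) // empty merge
}
```

Under this sketch, only the no-op case would be a candidate for an error; the empty merge case would proceed to write a merge commit that reuses the existing metarange ID.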

johnmantios commented 1 year ago

this seems a bit more complex than what I've taken up so far but I am willing to give it a try! Assign it to me please :)

arielshaqed commented 1 year ago

ozkatz commented 1 year ago

makes sense!

itaiad200 commented 1 year ago

With pleasure. @johnmantios here's some material for ramping up on lakeFS internals:

  1. Versioning internals
  2. Merge
  3. Glossary

github-actions[bot] commented 1 year ago

This issue is now marked as stale after 90 days of inactivity, and will be closed soon. To keep it, mark it with the "no stale" label.

arielshaqed commented 7 months ago

@johnmantios please let me know if you still want this issue, of course!

johnmantios commented 7 months ago

Hey @arielshaqed 👋 Unfortunately I cannot take this: my current job takes almost all of my time 😔