treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data
https://docs.lakefs.io
Apache License 2.0
4.46k stars 359 forks source link

Add support for hidden branches #8375

Closed N-o-Z closed 2 days ago

N-o-Z commented 1 week ago

Closes #8355

Change Description

Background

This came as a discussion regarding the use of ephemeral branches. For example for the use of transactions (Python SDK)

New Feature

Add support for creating hidden branches

Hidden branches by default not shown on branch listing Added a new flag for branch listing to show all branches (disabled by default) Hidden branches can still be fetched using the GetBranch API

Testing Details

Add unit testing for graveler and controller

Breaking Change?

No

github-actions[bot] commented 1 week ago

:recycle: PR Preview 4a48b437cc4f18a639f196b6015d88aaa9c70647 has been successfully destroyed since this PR has been closed.

🤖 By surge-preview

github-actions[bot] commented 1 week ago

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed
github-actions[bot] commented 1 week ago

E2E Test Results - Quickstart

11 passed
N-o-Z commented 6 days ago

Wow, thanks! This make isolation on lakeFS so much better!

Small things:

  • I would prefer to mark this "experimental". While I am quite sure that we will want to keep it, I can imagine the API changing.
  • Check annotation in catalog_test.go suggests we need to run gomock or something.
  • Not sure about controller tests.

Larger issue: I would like us to re-think how we handle "special-purpose" metadata.

I imagine we will end up with user metadata for branches. Indeed I can think of several purposes for this already. I think I would prefer to have general metadata for branches -- key-value, essentially. If we had then, we could declare that we continue to own the ::lakefs::* namespace (same as we do on objects) -- and then a branch is hidden if it has ::lakefs::hidden set.

I do like the new interface for ListBranches. If (as an alternative) we implemented ListBranches as "list branches with this general filter language for branch metadata", then hidden branches would not really be hidden unless you asked for that. So I do not want to do that!

Thanks for the review. Not so sure about the metadata approach. User metadata is one thing, but defining branch properties as internal metadata is something I think we should be careful with - it's much more prone to errors IMO. We might want to also refrain from coupling features which don't necessarily have close relation and especially if they don't exist yet (nor on the roadmap).