google / copybara

Copybara: A tool for transforming and moving code between repositories.
Apache License 2.0

Initial import of existing repo and ITERATIVE mode usage question #265

Open AlexTrotsenko opened 11 months ago

AlexTrotsenko commented 11 months ago

We are currently migrating some existing repos to a new monorepo structure while keeping some of them accessible to a wider audience.

To keep the full git history of each individual "public" repo, we used "ITERATIVE" mode for the initial migration to the "private" monorepo.

Afterwards we set up a workflow in "ITERATIVE" mode for private -> public (as advised in https://github.com/google/copybara/issues/24#issuecomment-324925392).

However, since we use Bitbucket and GitLab, we can't use CHANGE_REQUEST mode, so our initial idea was to keep "ITERATIVE" mode for the "public" -> "private" workflow (the same as during the initial migration).

And we have discovered an interesting behaviour:

At the same time, if we use "SQUASH" for the "public" -> "private" workflow, Copybara is able to detect an empty change with any number of commits.

Interestingly enough, I have noticed that if we enable metadata.squash_notes() and commit a new change to "private", these commits are also present in the "squash" commit details along with the new change's commit.

May I ask what the intended use/set-up of Copybara is for the case where a public repo must first be "imported" into a monorepo and the repos are not GitHub repos?

Perhaps I am missing something, and Copybara can use the GitOrigin-RevId of the "public" commits once they are migrated back to "public" and just skip them?

AlexTrotsenko commented 11 months ago

Just in case - my current idea is to:

  1. Run the 1st "public" -> "private" migration "manually" with --init-history and the workflow in "ITERATIVE" mode.
  2. Set up CI to run the "public" -> "private" workflow in "SQUASH" mode.
  3. Add the 1st new commit to "private", then run the "private" -> "public" migration "manually" in "ITERATIVE" mode, pointing --last-rev at the latest revision that was migrated from public in step 1.
  4. Set up CI to run the "private" -> "public" workflow in "ITERATIVE" mode.

Can you please advise whether this is the right way to do the set-up? Is there perhaps a more efficient approach?
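The four steps above could be sketched roughly as the following copy.bara.sky, with the manual invocations shown as comments. This is only an illustration of the plan as described: the URLs, the workflow names, the libs/public-lib path, the default author, and the <rev> placeholder are all assumptions, not taken from a real set-up.

```starlark
# copy.bara.sky (hypothetical sketch; every URL, name, and path is a placeholder)

PUBLIC_URL = "https://example.com/org/public-lib.git"   # assumption
PRIVATE_URL = "https://example.com/org/monorepo.git"    # assumption
SUBDIR = "libs/public-lib"                              # assumed location inside the monorepo
DEFAULT_AUTHOR = "Copybara <copybara@example.com>"      # assumption

# Steps 1 and 2: "public" -> "private".
# Step 1, run manually once:
#   copybara copy.bara.sky public_to_private --init-history
# Step 2: the plan would switch mode to "SQUASH" before CI takes over.
core.workflow(
    name = "public_to_private",
    origin = git.origin(url = PUBLIC_URL, ref = "main"),
    destination = git.destination(url = PRIVATE_URL, fetch = "main", push = "main"),
    origin_files = glob(["**"]),
    destination_files = glob([SUBDIR + "/**"]),
    authoring = authoring.pass_thru(DEFAULT_AUTHOR),
    mode = "ITERATIVE",
    transformations = [
        # Place the public repo's files under the monorepo subdirectory.
        core.move("", SUBDIR),
    ],
)

# Steps 3 and 4: "private" -> "public".
# Step 3, run manually once, where <rev> stands for the revision from step 1:
#   copybara copy.bara.sky private_to_public --last-rev <rev>
# Step 4: CI runs the same workflow without extra flags.
core.workflow(
    name = "private_to_public",
    origin = git.origin(url = PRIVATE_URL, ref = "main"),
    destination = git.destination(url = PUBLIC_URL, fetch = "main", push = "main"),
    origin_files = glob([SUBDIR + "/**"]),
    destination_files = glob(["**"]),
    authoring = authoring.pass_thru(DEFAULT_AUTHOR),
    mode = "ITERATIVE",
    transformations = [
        # Move the subdirectory's contents back to the repo root.
        core.move(SUBDIR, ""),
    ],
)
```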

mikelalcon commented 11 months ago

One important thing about Copybara is that it supports a single source of truth for a set of files (the roots in the glob). That means the case you described above is intentionally not supported. You need to use CHANGE_REQUEST for one of the directions. It should be the one importing "PRs". Note that even though GitLab/Bitbucket are not specifically supported, as long as the incoming pending change has a ref, you can use that ref, e.g. refs/foo/bar/1234. When we created Copybara, we didn't have Gerrit- or GitHub-specific origins/destinations; we just had git.origin and git.destination. The reason we added support for specific review systems was these two:

copybara path/to/copy.bara.sky workflow refs/foo/bar/baz/1234
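A minimal sketch of such a workflow using plain git.origin and git.destination might look like the following. The URLs, workflow name, destination path, push branch, and default author are hypothetical; the pending change's ref is passed on the command line, as in the command above.

```starlark
# copy.bara.sky (hypothetical sketch; every URL, name, and path is a placeholder)

core.workflow(
    name = "import_change",  # assumed workflow name
    # Plain git origin: the ref given on the command line
    # (e.g. refs/foo/bar/1234) is used instead of the default ref.
    origin = git.origin(
        url = "https://example.com/org/public-lib.git",  # assumption
        ref = "main",
    ),
    # The private repo is the source of truth; the imported change is
    # pushed to a separate branch so it can be reviewed before merging.
    destination = git.destination(
        url = "https://example.com/org/monorepo.git",  # assumption
        fetch = "main",
        push = "imports/pending-change",  # assumed branch name
    ),
    origin_files = glob(["**"]),
    destination_files = glob(["libs/public-lib/**"]),  # assumed path
    authoring = authoring.pass_thru("Copybara <copybara@example.com>"),
    # Import one pending change rather than mirroring the branch history.
    mode = "CHANGE_REQUEST",
    transformations = [
        core.move("", "libs/public-lib"),
    ],
)
```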

AlexTrotsenko commented 11 months ago

@mikelalcon thanks for the information regarding the intended use of copybara.

I have checked, and CHANGE_REQUEST mode indeed generates clean squash_notes, so it will be the way to go for us as well.

Also, can you please describe the expected workflow with the private repo as SoT? We set up ITERATIVE from private to public, right? But what should the set-up be from public to private? E.g. an external user opens a PR in the public repo from a branch named something like feature_x, targeting main.

If I understood correctly, we should have another Copybara workflow, in CHANGE_REQUEST mode, from public to private.

But which branch should be used there? E.g. should we create a feature_x branch in the private repo with the same "copied" commits (from the public PR) and merge it into the main branch of the private repo? What happens to the PR in the public repo? Should it be declined, or handled some other way?