salesforce / p4-fusion

A fast Perforce to Git conversion tool written in C++ using Perforce Helix Core C++ API and Libgit2
BSD 3-Clause "New" or "Revised" License

Support view mappings for branches #54

Closed rpetti closed 1 year ago

rpetti commented 1 year ago

The docs indicate that conversion only works for a single branch, and even then only for a branch that lives under a single depot path. Perforce allows a branch to be located anywhere in the depot, and different branches can even have different view mappings.

Proposal: allow the user to specify a configuration file that describes each branch and the view mappings to sync for it:

[main]
//depot/main/config/my-component/... config/...
//depot/main/pkg/my-subcomponent-a/... pkg/my-subcomponent-a/...
//depot/main/pkg/my-subcomponent-b/... pkg/my-subcomponent-b/...

[dev]
//depot/comp/my-component/branches/dev/... ...
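
A minimal sketch of how a file in the proposed format could be parsed, assuming each non-empty line is either a `[branch]` header or a `<depot path> <git path>` pair with no spaces inside the paths. The function name `parse_branch_views` is hypothetical and is not part of p4-fusion.

```python
def parse_branch_views(text):
    """Parse '[branch]' sections followed by '<depot path> <git path>' lines.

    Returns a dict mapping branch name -> list of (depot_path, git_path).
    Assumption: paths contain no whitespace.
    """
    branches = {}
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            if not current:
                raise ValueError(f"line {lineno}: empty branch name")
            branches[current] = []
            continue
        if current is None:
            raise ValueError(f"line {lineno}: mapping outside a branch section")
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '<depot path> <git path>'")
        branches[current].append((parts[0], parts[1]))
    return branches


EXAMPLE = """\
[main]
//depot/main/config/my-component/... config/...
//depot/main/pkg/my-subcomponent-a/... pkg/my-subcomponent-a/...
//depot/main/pkg/my-subcomponent-b/... pkg/my-subcomponent-b/...

[dev]
//depot/comp/my-component/branches/dev/... ...
"""

views = parse_branch_views(EXAMPLE)
for branch, mappings in views.items():
    print(branch, len(mappings))
```

Each section becomes one Git branch, and each mapping line says where a depot subtree should land inside that branch's working tree.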

twarit-waikar commented 1 year ago

This is definitely the point where we need to consider how much of Perforce we support and translate to Git.

It may not be possible to support every feature that Perforce provides, such as p4 streams and branch view mappings, simply because of the scope of the problem that p4-fusion is solving.

Do you have a specific use case in mind for this kind of functionality?

p4-fusion is intended to be run across the entire VCS (or at least the most relevant subdirectory), which is why we initially built it to convert from a single depot path only.

rpetti commented 1 year ago

The use case would be to do a conversion that preserves all branches and merge history for a project, irrespective of how the branches are structured in Perforce. Being able to specify the mapping between git branches and p4 depots seemed like the most straightforward way to accomplish this...

I can't find any other tool capable of doing a proper conversion in this manner. Git Fusion could do it, but Perforce doesn't make that available anymore.

If it's too difficult to add to this tool or you believe it is out of scope then I understand.

twarit-waikar commented 1 year ago

I understand the use case. Ideally, p4-fusion would create branches in Git that match the ones in Perforce, and would also create merge commits on the main branch when integration CLs are encountered in the history. I do believe that's what p4-fusion should ideally be.

However, adding that kind of functionality needs some extra research. That's not to say it's impossible, just that it needs an in-depth understanding of branch and view mappings in Perforce to be able to convert them into Git.

groboclown commented 1 year ago

If we limit the investigation to the Perforce streaming model (branches in the form //depot/branchname), then some simplifying assumptions can be made:

  1. Each "branchname" is its own branch (branch maps generalize this, with each branch map defining its own branch, and the stream branch model is one specific form of that, but overlapping branch maps would be tricky and potentially won't do what the end user wants in all cases);
  2. Each changelist affects only a single branch (with classic depots, where changelists can spread across multiple locations, a single changelist could be broken into one change per branch).
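
The second assumption can be sketched as follows: a changelist that touches several branches is split into one group of files per branch. This is only an illustration that assumes the `//depot/branchname/...` layout described above; `branch_of` and `split_by_branch` are hypothetical names, not p4-fusion functions.

```python
from collections import defaultdict


def branch_of(depot_path):
    """Return the branch name for a path shaped like //depot/branchname/..."""
    parts = depot_path.lstrip("/").split("/")
    # Expect at least: depot, branch name, and one path component below it.
    if len(parts) < 3:
        raise ValueError(f"path is not inside a branch: {depot_path}")
    return parts[1]


def split_by_branch(changelist_files):
    """Group the files of one changelist into one list per branch."""
    groups = defaultdict(list)
    for path in changelist_files:
        groups[branch_of(path)].append(path)
    return dict(groups)


files = [
    "//depot/main/src/a.txt",
    "//depot/dev/src/a.txt",
    "//depot/main/src/b.txt",
]
per_branch = split_by_branch(files)
print(per_branch)
```

Each resulting group could then become its own Git commit on the matching branch.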

This leaves us inspecting the per-file action types branch, move/add, move/delete, and integrate (I'm leaving out import and archive, which aren't exactly branching commands). I'm referencing the action types as reported by the fstat command.

Simple integration is the equivalent of a Git cherry-picked merge. There isn't much complication here. It gets complicated if we want a formal merge history, as that would require inspecting the formal merge file list and ensuring it matches the integration changelist's file list.

Integration with edit is more difficult. It's a Git cherry-picked merge with a conflict edit (my knowledge of the Git API is limited).

In most cases, the move/add and move/delete pair should sit within a single branch, but there's nothing stopping it from spanning branches. That logic is much more complicated. If done cross-branch, it's the Git equivalent of merge -> delete.

groboclown commented 1 year ago

More thoughts on this issue:

groboclown commented 1 year ago

I'm putting a proof-of-concept for this together.

The out-of-scope scenarios so far:

groboclown commented 1 year ago

Another tricky scenario. This is the kicker that will slow everything down.

Dealing with this scenario (which is a must) means storing a mapping between changelists and Git commit hashes.
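
A rough sketch of what such a mapping could look like, assuming commits are recorded per branch as each changelist is converted. The class and method names are hypothetical. The "at or before" lookup is an assumption about how the mapping would be queried when an integration points at an older changelist.

```python
import bisect


class CommitIndex:
    """Per-branch record of which Git commit each changelist produced."""

    def __init__(self):
        self._changelists = {}  # branch -> sorted list of changelist numbers
        self._hashes = {}       # (branch, changelist) -> commit hash

    def record(self, branch, changelist, commit_hash):
        """Remember the commit created on `branch` for `changelist`."""
        if (branch, changelist) not in self._hashes:
            bisect.insort(self._changelists.setdefault(branch, []), changelist)
        self._hashes[(branch, changelist)] = commit_hash

    def commit_at_or_before(self, branch, changelist):
        """Return the newest commit on `branch` at or before `changelist`."""
        known = self._changelists.get(branch, [])
        i = bisect.bisect_right(known, changelist)
        if i == 0:
            return None  # the branch has no commit that early
        return self._hashes[(branch, known[i - 1])]


index = CommitIndex()
index.record("dev", 1, "aaa111")
index.record("dev", 3, "ccc333")
print(index.commit_at_or_before("dev", 2))
```

The cost is one entry per converted changelist per branch, plus a binary search per lookup.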

groboclown commented 1 year ago

The case where Perforce has an integration from a non-head revision makes this an incredibly difficult problem. In most cases in the history of the depot, we'll be dealing with a "yes, it's a branch" situation of:

In this scenario, CL 3 is a valid merge, but inspecting the log history of //a/main/...@3 shows that b.txt was merged from CL 1 and c.txt was merged from CL 2. They don't have the same source changelist.
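
To make the mismatch concrete, here is a small sketch with hypothetical data matching the description: each file in the merge changelist records the changelist it was integrated from, and the sources differ. Choosing the newest source as the merge parent is only one possible policy, shown here as an assumption and not necessarily what the proof-of-concept does.

```python
def distinct_sources(file_sources):
    """Return the sorted set of source changelists for one merge changelist."""
    return sorted(set(file_sources.values()))


def merge_source_changelist(file_sources):
    """Pick one source changelist to act as the Git merge parent.

    Assumption: the newest source changelist is used, since it includes
    the state of every older source on the same branch.
    """
    if not file_sources:
        return None
    return max(file_sources.values())


# Hypothetical per-file integration sources for CL 3.
cl3_sources = {"b.txt": 1, "c.txt": 2}
print(distinct_sources(cl3_sources), merge_source_changelist(cl3_sources))
```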

In the case of a pre-history merge:

In this situation, CL 4 is merging from a revision older than the state of the "dev" branch at the time CL 4 was submitted.

My current approach to dealing with these situations is a combination of discovery followed by action.

groboclown commented 1 year ago

My branching method seems to be mostly working:

https://github.com/groboclown/p4-fusion/tree/branch-support-poc

I'm tracking down a bug where a commit happens to the wrong branch, or possibly to multiple branches. It looks like a possible issue with my reference handling, or maybe it's just a real bug.

groboclown commented 1 year ago

Got it working! PR incoming.

I know my comments on this issue have turned into a developer blog. I hope I haven't been too noisy with posting this.