salesforce / p4-fusion

A fast Perforce to Git conversion tool written in C++ using Perforce Helix Core C++ API and Libgit2
BSD 3-Clause "New" or "Revised" License
78 stars 15 forks source link

Cannot Migrate a Stream Branch #35

Closed EricAtORS closed 2 years ago

EricAtORS commented 2 years ago

We use stream branches in our development. I created a virtual branch to narrow down the scope of what I want to export to git. I ran in to a number of issues, which I would normally open one for each, but you may need the whole story for them to all make sense. So sorry for the long story / issue.

Issue 1 - Mappings don't work properly

So I ran something like this after creating my new virtual branch:

p4-fusion --client "eyen_p4_fusion" \
    --includeBinaries true  \
    --path "//origin/main/..." \
    --port "perforce.theobjects.com:1666" \
    --printBatch 10 \
    --src /work/git \
    --user user \
    --lookAhead 100

but then I get an error like this:

[ PRINT @ Main:31 ] Running p4-fusion from: /work/p4-fusion/build/p4-fusion/p4-fusion
[ SUCCESS @ InitializeLibraries:144 ] Initialized P4Libraries successfully
[ PRINT @ Main:80 ] Updated client workspace view eyen_p4_fusion with 26 mappings
[ ERROR @ Main:92 ] The depot path specified is not under the p4_fusion_test client spec. Consider changing the client spec so that it does. Exiting.

I know it's under the spec. I just created the virtual branch and workspace.

So I just removed the return statement here and then I got further.

But before going further, I tracked down the problem to this: According to the docs

 Translate() is designed to map single files. 
To model the effect of passing a broader path through a mapping, create a new one-sided mapping that represents that path and Join() it with the other mapping.

In my workspace mapping, I had this:

//origin/main/parent_folder/folder1 /work/space/folder1/...
//origin/main/parent_folder/folder2 /work/space/folder2/...

But then when I try to map the --path argument //origin/main/..., Translate() is not able to determine that the path is valid.

So that's the crux of the issue and I don't know enough of the perforce api to know how to check properly. I can ask them for support, but I suppose you are more knowledgeable than I am.

Issue 2 - It crashes with malloc(): unaligned tcache chunk detected

After removing that return statement, I then get a crash here: image These are the parameters printed at the beginning

[ PRINT @ Main:134 ] Perforce Port: ---                                                                  
[ PRINT @ Main:135 ] Perforce User: --                                                                                                                 
[ PRINT @ Main:136 ] Perforce Client: ---                                                                                                     
[ PRINT @ Main:137 ] Depot Path: //origin/main/...                                                                                                       
[ PRINT @ Main:138 ] Network Threads: 48                                                                                                                     
[ PRINT @ Main:139 ] Print Batch: 10                                                                                                                         
[ PRINT @ Main:140 ] Look Ahead: 100                                                                                                                         
[ PRINT @ Main:141 ] Max Retries: 10                                                                                                                         
[ PRINT @ Main:142 ] Max Changes: -1                                                                                                                         
[ PRINT @ Main:143 ] Refresh Threshold: 100                                                                                                                  
[ PRINT @ Main:144 ] Fsync Enable: 0                                                                                                                         
[ PRINT @ Main:145 ] Include Binaries: 1                                                                                                                     
[ PRINT @ Main:146 ] Profiling: 0                                                                                                                            
[ PRINT @ Main:147 ] Profiling Flush Rate: 1000                             

I didn't know what the tcache bug meant so I ran it with ASAN and the code runs now. I don't know what this bug means. Maybe I should force it to use less network threads, but it ran.

So now I have my git repo, but this brings me to issue 3.

Issue 3: Workspace mappings are not respected

In my workspace mappings, I have mappings like this:

//origin/main/parent_folder/folder2 /work/space/folder2/...

but in my git repo, what I see when I do clone is this: git_folder/parent_folder/folder2 instead of git_folder/folder

Ideally, I want the p4-fusion to Trasnlate() the mapping to the local folder and then strip the workspace root and then append the stripped path to the git repo. I don't know where to go for there.

twarit-waikar commented 2 years ago

I think your explanation for issue 1 there is spot on. Now that I read the documentation for Translate() again, it does look like the way we are using Translate() is not really intended.

That could also be the reason for issue 2 and issue 3. I will take a look at how Translate() is supposed to be used according to the note you have added above.

twarit-waikar commented 2 years ago

@EricAtORS I was able to setup up a local Perforce stream depot and run p4-fusion against it.

Issue 1 - Mappings don't work properly

This will be fixed by #47

Issue 2 - It crashes with malloc(): unaligned tcache chunk detected

This seems to an issue in the Helix Core C++ API you used. Since this issue was opened, we have migrated to using 2022.1 version of the API and we also implemented a fix that might make the API crash unexpectedly in #37. I was not able to reproduce this crash on my x64 macOS Monterey.

Even if this problem still exists, I don't think we can do anything about it right now since it seems to be an API bug.

Issue 3: Workspace mappings are not respected

Yes, this is not a bug but actually a use-case that p4-fusion is catering to. p4-fusion intentionally doesn't use the user side mappings of the client spec (basically the right side of the clientspec view mappings or the workspace root) for any path conversions because we expect p4-fusion to be run by Perforce admins and the client spec is just there to control the scope of the conversion. However, this doesn't mean p4-fusion requires Perforce admin rights.

It is a rare use-case that we haven't seen before. Normal Perforce users wouldn't (read: shouldn't) perform the conversion to Git themselves, and so p4-fusion ignores any workspace data that is specific to that user. This means we don't read the workspace root, nor the right side view mappings from the client spec.

We do however read the left side view mappings so that the p4-fusion user can selectively clone specific parts of the code, or selectively exclude parts of the code at the same time. This is the only reason why we added support for reading the client spec during conversion.

twarit-waikar commented 2 years ago

@EricAtORS If it is possible, I would like to know if you'd be interested in testing the fix for this issue. It seems to work fine on my end

EricAtORS commented 2 years ago

@twarit-waikar Thanks for the update! I am interested in testing it, but I am currently at a conference and I will then be on holiday until September.

I can test it then. So you can close it now and I can reopen later if there are more issues or I can close it after I test it.

twarit-waikar commented 2 years ago

No issues! Merging the PR for now as completed.