Closed hazemakhalek closed 1 month ago
Revert and then applying back is an option to test. It is advisable to check the size of the whole repo in doing so, to make sure the repo does not explode in size. Alternatively, we can do some force push and emerge the PRs we added. This second option shall be avoided if possible as it is higher risk and is likely to be more difficult as PRs after the merge do have a different history. The path stated above is the first good attempt.
I'd recommend to test it in a clean fork and test the change in size of the whole repo. As the merge contains binaries, it is good to check its impact.
@ekatef also mentioned her availability. I'm a bit stuck till the weekend unfortunately, happy to support after in case
Hey! You can also keep @siddharth-krishna in the loop. He is our git pro ;) Reverting the merge commit can still be quite challenging in case other commits have been added. Hard pushing to the main branch should definitely be avoided imho.
Hey! You can also keep @siddharth-krishna in the loop. He is our git pro ;) Reverting the merge commit can still be quite challenging in case other commits have been added. Hard pushing to the main branch should definitely be avoided imho.
Great @siddharth-krishna , any feedback would be welcome. Comments above. As a summary, we would like to replace a past squash and merge into a simple merge.
For now options are:
As mentioned above, I wouldn't recommend force push either, but worth sharing all the options probably.
@siddharth-krishna , what do you think?
Thanks for tagging me, I'm happy to take a look and see if I can help. :)
Thanks for the summary, Davide, I agree with your (and others') recommendation against force push, as this makes life difficult for the many currently open PRs and anyone's unpushed local branches.
Reverting the squashed commit a8987468
is not so straightforward right now, since there have been commits on the main branch since a8987468
. But it's not impossible. Davide is right about the size though, the revert commit will be as big as the original commit, and if we then do a merge commit of the merge-pyspa-earth-sec
branch we'll effectively be adding all that code a 3rd time, so the size of the repo does go up by 2X the size of the pypsa-earth-sec code! (I'm not sure how big all this is, but we can try to calculate this.)
However, perhaps there's a 3rd option:
We can add a merge commit to the main
branch that merges merge-pypsa-earth-sec
branch into main. This would mostly be a redundant commit, as the code changes have already been included in main via a8987468
. However, this would have the benefit of not re-adding all that code, since merge commits are very small (they only contain info on how to resolve any conflicts). It also brings the entire commit history of pypsa-earth-sec back into the main branch:
Caveat: unfortunately, since we don't remove the squashed merge, if we ask for the commit history of any of the files from pypsa-earth-sec, we only see the squashed merge commit:
% git log --oneline Makefile
a8987468 Merge pull request #1086 from merge-pyspa-earth-sec
But we can add a line to the README that if your history only shows that commit, you can use the following command to instead show the history of the file from the merge-pyspa-earth-sec
branch:
% git log --oneline origin/merge-pyspa-earth-sec -- Makefile
1480d7a4 Merge remote-tracking branch 'pypsa-earth-sec/snakemake-module' into merge-pyspa-earth-sec
94ba3703 test: use more core
f229301e ci: use one core to better track log
...
dc4f4a2d Test routine: add Makefile CI: consolidate ci.yaml files
To summarize, Option 3: add a second (redundant) merge commit to main. Advantages:
a8987468
Disadvantages:
Thank you @siddharth-krishna for your suggestions. I'll review them tomorrow or Friday.
In the meantime, it's important to highlight that the implemented solution must meet two critical requirements:
1) Preserve the full commit history of the PyPSA-Earth-Sec repository: This reflects nearly three years of development and is crucial for transparency and future reference.
2) Ensure proper credit to all contributors of the PyPSA-Earth-Sec model: Every team member's work over the past three years must be accurately represented.
This is especially important because we will soon abandon the PyPSA-Earth-Sec repository (hopefully for good). So let's take the necessary time to solve this issue carefully ;)
Sounds good, thanks.
Here is a slight variant, Option 3a, that removes the disadvantage of Option 3 and shows file histories as expected:
Idea: first merge main
into the branch merge-pypsa-earth-sec
, in order to make that branch have the same contents as main
does currently. Then make a new PR from merge-pypsa-earth-sec
and rebase merge it in (fast-forward). This can be simulated in your local checkout by the following commands:
alias go='git checkout '
go merge-pyspa-earth-sec
git merge main -s ort -Xtheirs # Resolve conflicts using main's version, as main has new commits since a8987468
git diff main HEAD
# Empty output here shows that the contents are the same as in main
go main
git merge merge-pypsa-earth-sec
git diff origin/main HEAD
# Empty output here shows that the contents of main with Option 3a implemented are same as origin/main
The commit graph of main
now looks like:
And inspecting the history of files that came from pypsa-earth-sec shows their entire history as expected:
alias gl='git log -n 5 --pretty=format:"%Cred%h %Cgreen%an, %ar%Creset %s"'
gl Makefile
1480d7a4 Fabian, 8 weeks ago Merge remote-tracking branch 'pypsa-earth-sec/snakemake-module' into merge-pyspa-earth-sec
94ba3703 Fabian, 8 weeks ago test: use more core
f229301e Fabian, 8 weeks ago ci: use one core to better track log
23bc59fe Fabian, 9 weeks ago revert changes to config.default and config.tutorial priority
69817205 Fabian, 9 weeks ago workflow: use global root directory to avoid recursive upwards chdir
Advantages:
a8987468
Disadvantages:
Hi Sid,
This last suggestion actually sounds good. Very simple yet very smart. I believe this should cover the 2 main requirements: the first one is clear but I think the second one too.
@davide-f, @ekatef, @FabianHofmann, any downsides you can foresee? Also, @energyLS, please scan this and let us know if you have any suggestions.
Do you have suggestions for the way forward? @siddharth-krishna
I've been experimenting on the git reset usage, mainly out of curiosity. Potentially it is not that bad, see https://github.com/pypsa-meets-earth/pypsa-earth/compare/main..davide-f:pypsa-earth:test_reset_merge2 Note that despite it looks "nice" it shall be improved.
What I did was: git reset 3d6365ac6af69b86fcadb1de9809ed09986e654f (commit before the merge) (*** comment below) git merge merge-pypsa-earth-sec {Merged the branch of the merge} git cherry-pick {various commits after the merge event; note that I didn't merge the merge commits as they require some additional git stuff; to make things nice, we shall account for that and also preserve the order}
The result (luckily/surprisingly) does not look that bad actually. What do you think?
*** if we decide to go for this, here we should add the --hard option. I didn't do that here, otherwise main and this were not easily comparable with the link above
@davide-f I presume the option you are investigating requires us to force-push origin/main
to the reset-and-cherry-picked branch? That would work for sure, and results in a cleaner main
branch, but the downside is that it makes any open PR branches, and any local branches people may have, "out of sync".
This is not a hard problem. As long as we force-push main
to a new state that has the exact same file contents, it should be possible for people to rebase their branches back onto the force-pushed main
. But if many people in the community are not so familiar with git, or are not aware that we have done this, it could lead to some confusion, and would require some time from maintainers to guide everyone through the steps. It's an option worth considering, though.
Hi Sid,
This last suggestion actually sounds good. Very simple yet very smart. I believe this should cover the 2 main requirements: the first one is clear but I think the second one too.
@davide-f, @ekatef, @FabianHofmann, any downsides you can foresee? Also, @energyLS, please scan this and let us know if you have any suggestions.
Do you have suggestions for the way forward? @siddharth-krishna
@hazemakhalek thanks for keeping me in the loop. I have no further suggestions and would like to leave this to the git pros. I totally support the motivation behind it though.
To get a complete picture, have tested the initially suggested solution with reverting the merge commit and re-applying the merge properly. This exercise doesn't inflate size of the repo too much: disk usage by .git
increases by only ~0.1 MB, with the overall .git
size of about 30MB. Most likely, this small increase results from the fact that the merge commits themselfes are quite tiny, as @siddharth-krishna has explained. However, the solution Option 3b by @siddharth-krishna does basically the same with cleaner history.
Have tested Option 3b, and it looks like an excellent solution:
pypsa-sec-merge
branch [1668 commits (= 1669 - 1 auto-update of README) vs 1667 commits of the merge PR #1086 and 1589 commits of PyPSA-Earth-Sec]Given that GitHub itself seems not to have any issues with displaying contributions if Option 3b is used, contributions PyPSA-Earth-Sec should be represented in an accurate way.
@hazemakhalek may you have any ideas on which other metrics would be helpful to check?
@siddharth-krishna thank you so much for working on this! A great solution and really amazing insights into git power 🙂
I've been experimenting on the git reset usage, mainly out of curiosity. Potentially it is not that bad, see https://github.com/pypsa-meets-earth/pypsa-earth/compare/main..davide-f:pypsa-earth:test_reset_merge2 Note that despite it looks "nice" it shall be improved.
What I did was: git reset 3d6365a (commit before the merge) (*** comment below) git merge merge-pypsa-earth-sec {Merged the branch of the merge} git cherry-pick {various commits after the merge event; note that I didn't merge the merge commits as they require some additional git stuff; to make things nice, we shall account for that and also preserve the order}
The result (luckily/surprisingly) does not look that bad actually. What do you think?
*** if we decide to go for this, here we should add the --hard option. I didn't do that here, otherwise main and this were not easily comparable with the link above
@davide-f thank you so much for finding time to experiment with the fix. Agree that the result looks very nice, though cherry-picking definitely requires a lot of concentration, given the need to the force-push. In any case, that is fantastic to see that the problem definitely can be resolved! ❤️
It looks like all the solutions have git merge merge-pypsa-earth-sec
as the core incantation, so we seems to be converging 😄
Just looking there is only one possible downside of options 1 and 3*: line contributions do not match the past repo. As we squash meshed the commit anyone who contributed to the -sec now has obtained the same line contributions as the rest and this may not be well fixed by the additional merge PR. Needs to be checked.
It is easy to check the current status in https://github.com/pypsa-meets-earth/pypsa-earth/graphs/contributors When placing the contributions by additions, there is a huge spike due to the merge for many. Re-merging the whole commits may further lead to weird counting potentially. Fairness here is a point for both communities. Do we count commits only or addition/deletion?
How does it appear with the other options?
Do we want that? It seems conflicting to point 2 by hazem and may be conflicting also to -earth developer.
The force option may be tested in a parallel branch first. The risk of breaking existing PRs may not be big problem as except for the most recent ones, they have been forked before the merge, so no conflict is expected even in the event of force pushing. I believe the force option is unsafer, but we can keep copies to avoid damages
Regardless of the method, I'd propose to solve this by the end of the week to normalize the approach.
Current options are:
Can you express preferences/recommendations? @hazemakhalek @ekatef @FabianHofmann @siddharth-krishna
As we squash meshed the commit anyone who contributed to the -sec now has obtained the same line contributions as the rest
Yes, it looks like this is true. The squash merged commit a898746 has
119 files changed
+16423 -1817 lines changed
and git show
on it shows
commit a8987468ceda152ed1152f6c7bfa2ffb79da0837
Author: Fabian Hofmann <fab.hof@gmx.de>
Date: Thu Sep 19 23:17:23 2024 +0200
Merge pull request #1086 from merge-pyspa-earth-sec
So I think GitHub attributed that entire commit to the author of the PR, Fabian. This can be confirmed in https://github.com/pypsa-meets-earth/pypsa-earth/graphs/contributors
I created a test repo https://github.com/siddharth-krishna/test-contributions to test how GitHub's contributions are counted when PRs are squash merged vs merge merged. I tried to simulate what happened here: I made some dummy commits on main (I hope you'll excuse me, Fabian, for using your name for this test):
28c9bc9 Fabian Hofmann, 5 minutes ago A
0cfaa2e Siddharth Krishna, 8 minutes ago Initial commit
And this resulted in the following contributors graph:
I then made a branch sec-branch
that had some commits from both of us, but Fabian had 3 commits and 3 lines added, while I had 2 and 2:
6277065 Fabian Hofmann, 21 minutes ago F
9021486 Fabian Hofmann, 22 minutes ago E
166a50f Fabian Hofmann, 22 minutes ago D
a8a6f05 Siddharth Krishna, 22 minutes ago C
bbe1f9d Siddharth Krishna, 22 minutes ago B
28c9bc9 Fabian Hofmann, 27 minutes ago A
0cfaa2e Siddharth Krishna, 31 minutes ago Initial commit
I then opened a PR https://github.com/siddharth-krishna/test-contributions/pull/1, and did a squash merge. After the squashing, the contributors graph looked like:
I then implemented Option 3b by making a redundant merge commit on the sec-branch
and opening a new PR from the same branch https://github.com/siddharth-krishna/test-contributions/pull/2. I couldn't rebase merge this PR, not sure why, but I merge merged it in, resulting in the following git commit graph on the main branch:
And the contributors graph looked like this:
So it looks like GitHub attributes the squash merge as +1 commit to all authors involved, and also +L additions to all authors involved (where L is the number of lines added in the PR). If we then do Option 3b and make a redundant merge that brings in all the commits from the sec-branch, GitHub adds the correct number of commits and additions to each author.
On the plus side, this option brings the number of commits of all contributors back to almost what it should be (pypsa-earth-sec authors will have +1 extra commit, but perhaps that's an acceptable error). It also preserves the relative ranking between pypsa-earth-sec authors in terms of additions. But on the minus side, pyspa-earth-sec authors now have double counted their additions from the sec-branch, especially compared to pypsa-earth contributors who have not contributed to pypsa-earth-sec.
I suppose it all depends on which metric the contributors to both repos care about? And whether people would rather maintain an accurate "additions" metric, or force-push the main branch and potentially inconvenience people with open PRs and local branches. I am happy to support, no matter which option everyone ends up choosing. :)
(If I have time tomorrow I could also investigate the effect of Option 1 on the contributors graph, but my guess would be that a revert would not result in a decrease in "additions".)
I only now noticed the spike in line additions now. Thanks for pointing this out @davide-f
@siddharth-krishna, great and insightful work!
I personally only look at the number of commits. but I understand that this can be seen as unfair to the contributors of PyPSA-Earth.
As a side note: The additions counter for the main pypsa-earth-sec contributors (Leon and myself) is already inflated because we had a few csvs in the repo back in the very early stages of the repo in Q1 2022. If the additions count matter, maybe we can use something like git-filter-repo to remove the history of those csv files?
Lastly, I agree with Sid, I don't think this will automatically reduce the additions..
Great exploration @siddharth-krishna !!! <3 I believe we have all we need.
The summary is: options that involve re-merging the merge basically bring the number of commits to the number that should be but they don't fix the line changes (additions/deletions), possibly leading to (partial) double counting of some line of code.
I'd love to be able to see if with cherry-picking we could provide a better and cleaner approach without disrupting everything. Personally, I hope to find some time in the weekend for the test. Support is welcome.
I'd like to have an exchange with @hazem about it
By next week, let's fix it :D
I agree with everything above, but I had a few minutes this morning so for completeness (and my curiosity) I tested out what happens to contributions if we revert the squash merge:
Before the squash merge:
After squash merging https://github.com/siddharth-krishna/test-contributions/pull/3:
After Option 1 (revert squash merge and merge merge the branch again):
It looks like the revert commit counted as 1 commit and 3 deletions for me (Sid), and then the merge merge did the expected thing and got counted as 3 commits (2 on the branch and 1 merge commit) and 2 additions for me, and 1 commit and 1 addition for Fabian.
Conclusion: I think Option 1 does not help in terms of fixing commit or additions count, and might muddy the metrics further since it also adds a bunch of deletions to the person who makes the revert commit. So, as mentioned by Hazem and Davide our options remain:
I tested Option 2 in this PR: https://github.com/davide-f/pypsa-earth/tree/main that in terms of changes it is basically perfect: https://github.com/pypsa-meets-earth/pypsa-earth/compare/main..davide-f:pypsa-earth:main
Using the procedure as follows:
git reset 3d6365ac6af69b86fcadb1de9809ed09986e654f
git push --force
git clean -f
git fetch upstream
git status
git merge upstream/merge-pyspa-earth-sec
git cherry-pick 1777cb665ddb9d56d23aeb3f867a41ff1244742e
git cherry-pick 2350d55209bbeae09c6d019cfbf72d3e7989e80b
git cherry-pick d0fa6d6f3bccb18208c002a5d1d2353673e6a298
git cherry-pick 7b1249a20e6b56815329cc05ebf781e90b99d01f
git cherry-pick 7ef77254244e22982f4c7ba57b0fd63c7220101f
git push
git cherry-pick -m 1 ea99d748c1e7b7e6af9fd4ebad0554df70687e55
git commit --allow-empty
git cherry-pick 3142b1f71a1f73bfe753fd0107aa03c47e6740af
git cherry-pick ae4479fe3cb2405908d0962ff44de80e872873af
git cherry-pick 2cda6d768a28e79c8573be316a88e9a93c990242
git cherry-pick 3b8b952f978cf8f6c327284fc9f8e0f09c4e7d48
git cherry-pick 12edb982ef070470c08115a6ed752c0602dfa951
git cherry-pick 87d7f08f46faf9254ca8a7e31a01802a175a8a80
git cherry-pick 8fab463b47700523deefb679cca7eca75540e461
git cherry-pick a21ff8d77f2158503242c23a4cc803f8f30c046e
git push
git cherry-pick -m 1 328a32e7269853ed48c911a76e3282d5a76ed24d
git commit --allow-empty
git cherry-pick 356fb71eab053d62e9771a634c542391f5143524
git cherry-pick b64623d8d3d8127e186dbe3b9a899e326e1bcb72
git cherry-pick 75cb757f2a44464d94830812010dbd4e2c821e8c
git cherry-pick 2824af4fceff9a745e9d3ff1d02233422cd6b896
git cherry-pick 0a6f98c0104c859ba898f68a6230cf46ee401690
git cherry-pick 489ff8f9fb8b35cd4730910a23411da57ebd28e6
However, as a downside, the cherry-picks add the picker [me in this case] as co-author of each of the latest commits, though minimal.
I'll schedule a call to finalize this issue. If you're interested to join please react to this.
Today we met with @ekatef and @hazemakhalek and we are of common agreement that option 3b shall be the way to go. The reason being that it is extremely easier for the users and developers. Having better contribution list (additions/deletions) has not been considered a sufficient reason to compromise usability of the users.
Therefore, we shall go for option 3b. Katia has given availability to do so, as she also tested it locally. As a curiosity, have you pushed the result in a branch? It would be nice to see the result and then do it on main. If you need support, feel free to ask [no pressures] I'm wondering that it may be easy to create a new branch of the PR "merge" and remerge it here in github?
Today we met with @ekatef and @hazemakhalek and we are of common agreement that option 3b shall be the way to go. The reason being that it is extremely easier for the users and developers. Having better contribution list (additions/deletions) has not been considered a sufficient reason to compromise usability of the users.
Therefore, we shall go for option 3b. Katia has given availability to do so, as she also tested it locally. As a curiosity, have you pushed the result in a branch? It would be nice to see the result and then do it on main. If you need support, feel free to ask [no pressures] I'm wondering that it may be easy to create a new branch of the PR "merge" and remerge it here in github?
The result of application Option 3a are here in this comparison.
The new branch fix_squash_merge3
is a clone of main
to which I have applied the list of git commands proposed by @siddharth-krishna. A little additional adjustment will be needed to use main
version of add_extra_components.py
instead of merge-pypsa-earth-sec
one.
@ekatef thanks for trying it out! I took a look at your branch, and there might be 2 minor issues:
git merge main -s ort -Xtheirs
, there's a difference between main
and your branch on add_extra_components.py
... not sure why this is happening, but it's not a big deal, we can always do git merge main
and resolve the conflicts manually while ensuring that we always pick main
's version of each conflict.ekatef:fix_squash_merge3
and then create a PR to main
that is then merged with a merge commit, this doesn't fix the issue with file histories. Unfortunately, the merge commit created by GitHub when merging in a PR has the "main parent" as the main branch (which is normally what you want), so a git log
query after merging will only report the commits on the main branch (i.e. the squashed merge commit, and not the full history from the merge-pyspa-earth-sec
branch).This can be seen in my test repository https://github.com/siddharth-krishna/test-contributions, where I implemented Option 3b on a branch called sec-branch
and then merged it in with a merge commit:
* 1d4ab0c Siddharth Krishna, 5 days ago Merge pull request #2 from siddharth-krishna/sec-branch
|\
| * 872ff0a (origin/sec-branch, sec-branch) Siddharth Krishna, 5 days ago Merge branch 'main' into sec-branch
| |\
| |/
|/|
* | 7e1d4d9 Somebody Else, 5 days ago X
* | c006a44 Siddharth Krishna, 5 days ago Merge the sec branch (#1)
| * 6277065 Fabian Hofmann, 5 days ago F
| * 9021486 Fabian Hofmann, 5 days ago E
| * 166a50f Fabian Hofmann, 5 days ago D
| * a8a6f05 Siddharth Krishna, 5 days ago C
| * bbe1f9d Siddharth Krishna, 5 days ago B
|/
* 28c9bc9 Fabian Hofmann, 5 days ago A
* 0cfaa2e Siddharth Krishna, 5 days ago Initial commit
Unfortunately, if you are at commit 1d4ab0c
and you ask for the history of the README file (that all commits have modified), you see:
gl README.md
7e1d4d9 Somebody Else, 5 days ago X
c006a44 Siddharth Krishna, 5 days ago Merge the sec branch (#1)
28c9bc9 Fabian Hofmann, 5 days ago A
0cfaa2e Siddharth Krishna, 5 days ago Initial commit
Instead of the desired full history:
7e1d4d9 Somebody Else, 5 days ago X
c006a44 Siddharth Krishna, 5 days ago Merge the sec branch (#1)
6277065 Fabian Hofmann, 5 days ago F
9021486 Fabian Hofmann, 5 days ago E
166a50f Fabian Hofmann, 5 days ago D
a8a6f05 Siddharth Krishna, 5 days ago C
bbe1f9d Siddharth Krishna, 5 days ago B
28c9bc9 Fabian Hofmann, 5 days ago A
0cfaa2e Siddharth Krishna, 5 days ago Initial commit
On the other hand, if we checkout ekatef:fix_squash_merge3
and ask for the history of a file we do see the detailed changes from both main
and merge-pyspa-earth-sec
:
git checkout ekatef/fix_squash_merge3
git log --oneline .gitignore
7aded83c Fabian, 8 weeks ago solve_networks: fix network reference for sector coupled case
1480d7a4 Fabian, 10 weeks ago Merge remote-tracking branch 'pypsa-earth-sec/snakemake-module' into merge-pyspa-earth-sec
031ad55d Fabian, 3 months ago replace snakemake subworkflow by submodule; fix path relations
d6299207 Hazem-IEG, 11 months ago industrial database v1
9e3d7ca8 energyls, 1 year ago feat: gitignore cost data
9566c0da carlosfv, 1 year, 1 month ago Changes made: *git-ignore file has been updated to avoid all .geojson files *details of new parameters have been added to the configtables (clean_osm_data_options.csv) *the config.default and config.tutorial files have been renamed and comments have been formated to fit PEP 8 recommendations *clean_osm_data script has been renamed and now the description of the added function follows PEP8 format and matches other functions *the proposed function in clean_osm_data has now been tested for substations also and is now being used in substations and cables
97c2f2f0 Hazem-IEG, 1 year, 1 month ago add default values for EG and ignore slurm folder
174ae272 Hazem-IEG, 1 year, 2 months ago resolving PR comments
6d54321a carlosfv, 1 year, 2 months ago Changes made: *The gitignore file added the custom_line path in the data folder to be included recognized (custom _lines.geojson has some predefined values as an example) *the config.default file has been modified (lines 104) to include a flag named "use_custom_lines" to define if the model will use the default osm data (False) or the custom file (True) *the clean_osm_script has been adapted (line 855) to include a conditional (if), linked to the flag in the config file in order to use either the default osm data or the custom data file to create the base dataframe of lines (df_lines) *the config.tutorial file has been modified (lines 118) to include a flag named "use_custom_lines" to define if the model will use the default osm data (False) or the custom file (True) and run the tests for the PR *To run the model with custom data, a file has to be created and added in the data folder named "custom_lines.geojson"
86898d07 Hazem-IEG, 1 year, 2 months ago adding the excel file with all the paths
56c97afe Hazem-IEG, 1 year, 2 months ago remove unnecessay commands and files
7b44631e Davide Fioriti, 1 year, 3 months ago bugfix: Fix admin needs for windows
d0d709d5 Davide Fioriti, 1 year, 4 months ago Fix pre-commit and revise style (#779)
3ac1efdd Davide Fioriti, 1 year, 4 months ago Merge branch 'main' into fix_empty_data_countries
b015ab28 Davide Fioriti, 1 year, 4 months ago Fix summary (#764)
09fbece4 Max Parzen, 1 year, 5 months ago fix typo
71dd6006 Max Parzen, 1 year, 5 months ago add dask scheduler
Perhaps the way to implement Option 3b is to apply the commands on a branch like ekatef:fix_squash_merge3
, use the Compare page/git diff
to ensure that there are no differences to main
and then force push that branch to main
.
@ekatef thanks for trying it out! I took a look at your branch, and there might be 2 minor issues:
- As you point out, for some reason despite using
git merge main -s ort -Xtheirs
, there's a difference betweenmain
and your branch onadd_extra_components.py
... not sure why this is happening, but it's not a big deal, we can always dogit merge main
and resolve the conflicts manually while ensuring that we always pickmain
's version of each conflict.- If we apply Option 3a to a branch such as
ekatef:fix_squash_merge3
and then create a PR tomain
that is then merged with a merge commit, this doesn't fix the issue with file histories. Unfortunately, the merge commit created by GitHub when merging in a PR has the "main parent" as the main branch (which is normally what you want), so agit log
query after merging will only report the commits on the main branch (i.e. the squashed merge commit, and not the full history from themerge-pyspa-earth-sec
branch).This can be seen in my test repository https://github.com/siddharth-krishna/test-contributions, where I implemented Option 3b on a branch called
sec-branch
and then merged it in with a merge commit:* 1d4ab0c Siddharth Krishna, 5 days ago Merge pull request #2 from siddharth-krishna/sec-branch |\ | * 872ff0a (origin/sec-branch, sec-branch) Siddharth Krishna, 5 days ago Merge branch 'main' into sec-branch | |\ | |/ |/| * | 7e1d4d9 Somebody Else, 5 days ago X * | c006a44 Siddharth Krishna, 5 days ago Merge the sec branch (#1) | * 6277065 Fabian Hofmann, 5 days ago F | * 9021486 Fabian Hofmann, 5 days ago E | * 166a50f Fabian Hofmann, 5 days ago D | * a8a6f05 Siddharth Krishna, 5 days ago C | * bbe1f9d Siddharth Krishna, 5 days ago B |/ * 28c9bc9 Fabian Hofmann, 5 days ago A * 0cfaa2e Siddharth Krishna, 5 days ago Initial commit
Unfortunately, if you are at commit
1d4ab0c
and you ask for the history of the README file (that all commits have modified), you see:gl README.md 7e1d4d9 Somebody Else, 5 days ago X c006a44 Siddharth Krishna, 5 days ago Merge the sec branch (#1) 28c9bc9 Fabian Hofmann, 5 days ago A 0cfaa2e Siddharth Krishna, 5 days ago Initial commit
Instead of the desired full history:
7e1d4d9 Somebody Else, 5 days ago X c006a44 Siddharth Krishna, 5 days ago Merge the sec branch (#1) 6277065 Fabian Hofmann, 5 days ago F 9021486 Fabian Hofmann, 5 days ago E 166a50f Fabian Hofmann, 5 days ago D a8a6f05 Siddharth Krishna, 5 days ago C bbe1f9d Siddharth Krishna, 5 days ago B 28c9bc9 Fabian Hofmann, 5 days ago A 0cfaa2e Siddharth Krishna, 5 days ago Initial commit
On the other hand, if we checkout
ekatef:fix_squash_merge3
and ask for the history of a file we do see the detailed changes from bothmain
andmerge-pyspa-earth-sec
:git checkout ekatef/fix_squash_merge3 git log --oneline .gitignore 7aded83c Fabian, 8 weeks ago solve_networks: fix network reference for sector coupled case 1480d7a4 Fabian, 10 weeks ago Merge remote-tracking branch 'pypsa-earth-sec/snakemake-module' into merge-pyspa-earth-sec 031ad55d Fabian, 3 months ago replace snakemake subworkflow by submodule; fix path relations d6299207 Hazem-IEG, 11 months ago industrial database v1 9e3d7ca8 energyls, 1 year ago feat: gitignore cost data 9566c0da carlosfv, 1 year, 1 month ago Changes made: *git-ignore file has been updated to avoid all .geojson files *details of new parameters have been added to the configtables (clean_osm_data_options.csv) *the config.default and config.tutorial files have been renamed and comments have been formated to fit PEP 8 recommendations *clean_osm_data script has been renamed and now the description of the added function follows PEP8 format and matches other functions *the proposed function in clean_osm_data has now been tested for substations also and is now being used in substations and cables 97c2f2f0 Hazem-IEG, 1 year, 1 month ago add default values for EG and ignore slurm folder 174ae272 Hazem-IEG, 1 year, 2 months ago resolving PR comments 6d54321a carlosfv, 1 year, 2 months ago Changes made: *The gitignore file added the custom_line path in the data folder to be included recognized (custom _lines.geojson has some predefined values as an example) *the config.default file has been modified (lines 104) to include a flag named "use_custom_lines" to define if the model will use the default osm data (False) or the custom file (True) *the clean_osm_script has been adapted (line 855) to include a conditional (if), linked to the flag in the config file in order to use either the default osm data or the custom data file to create the base dataframe of lines (df_lines) *the config.tutorial file has been modified (lines 118) to include a flag named "use_custom_lines" to define if the model will use the default osm data (False) or the custom file (True) and run the tests for the PR *To run the model with custom data, a file has to be created and added in the data folder named "custom_lines.geojson" 86898d07 Hazem-IEG, 1 year, 2 months ago adding the excel file with all the paths 56c97afe Hazem-IEG, 1 year, 2 months ago remove unnecessay commands and files 7b44631e Davide Fioriti, 1 year, 3 months ago bugfix: Fix admin needs for windows d0d709d5 Davide Fioriti, 1 year, 4 months ago Fix pre-commit and revise style (#779) 3ac1efdd Davide Fioriti, 1 year, 4 months ago Merge branch 'main' into fix_empty_data_countries b015ab28 Davide Fioriti, 1 year, 4 months ago Fix summary (#764) 09fbece4 Max Parzen, 1 year, 5 months ago fix typo 71dd6006 Max Parzen, 1 year, 5 months ago add dask scheduler
Perhaps the way to implement Option 3b is to apply the commands on a branch like
ekatef:fix_squash_merge3
, use the Compare page/git diff
to ensure that there are no differences tomain
and then force push that branch tomain
.
Thanks a lot @siddharth-krishna!
Regarding the difference for add_extra_components.py
, I suspect that git
doesn't qualify the change as a conflict and doesn't see a need to apply -Xtheirs
. Agree though that it's rather a minor issue.
I have experimented with Option3a and can confirm that you are completely right on using GitHub pull requests. A pull request to upstream/main
looks promising (here, but testing in a fork reveals the pitfalls you have identified.
If I'm merging the identical pull request in my fork (branch merge-backup-isolated
), it seems to bring back the commits to the project history but not the history of the files:
When I'm applying Option 3b locally and push the result, the file history is brought back:
Not sure if we could find a way to change parameters of Github merge...
Agree that pushing local changes sound more realistically. I wonder if it could be a safer option, as compared with force-pushing the fix_merge
branch, to make changes directly in the local version of upstream/main
, and push only the merge commit which has been previously done locally.
Have restored the history in my origin/main
. The preparation steps mainly followed the algorithm of Option 3a
:
1) content of merge-pypsa-earth-sec
copied into a merge-pypsa-earth-sec-backup
;
2) main
merged into merge-pypsa-earth-sec-backup
;
3) a missed commit 75cb757f2a44464d94830812010dbd4e2c821e8c cherry-picked;
4) modified main pushed to my origin.
The result seems to be alright:
1) no differences with main in terms of files content (an empty output of git diff upstream/main HEAD
)
2) the history of commits is preserved, e.g. this commit from pypsa-earth-sec
is searchable by git log
3) history of files is alright (e.g. build_base_energy_totals.py which has been added by the merge).
Due to the issue with GitHub merge commit we discussed previously, the next step must be push of the fixed main to upstream. @davide-f is would be great to have your double-checking for that.
@ekatef thanks for the dry run, and checking that the histories are preserved. I took a look at your fork and the branch ekatef/main
looks good to me. I agree that the way to go would be to force push the commits on ekatef/main
to origin/main
. Also sounds like a good idea for someone else to double check everything. :)
🤞
Done and seems to work 🚀 Thanks a lot for support @davide-f and @siddharth-krishna ! It has been an exciting exercise, but hope to avoid the need to repeat those in the future 😄
@hazemakhalek do you think we can close the issue?
Everything looks fine, so yes I believe so. Thanks!
Thanks Everyone for your efforts! @siddharth-krishna @ekatef @davide-f
Preserve PyPSA-Earth-Sec history
The MERGE PR has been mistakenly squashed and merged and now the entire history of the pypsa-earth-sec repo is not shown in the main pypsa-earth repo.
Suggested solution:
Revert the merge
Remerge with preserving history
Reapply the following commits.
Important: Do not delete the merge branch