Open GustavoVale opened 7 years ago
@fheck Do you have some idea?
With the following line I can get the commits I want "wget -O - -q https://api.github.com/repos/_:user_/_:repository_/pulls/_:numberPR_/commits | grep sha"
Hi @fheck your changes did not solve the problem.
In line 176 (} while (c != null && ! c.equals(base));) I had to add "!" because I got a loop.
I am interested in get the commits on GitHub. Using the GitHub API (command in a previous comment) I am able to get the right commits. Could you insert a command to get it? I think this is the easiest way.
Hi guys!
I remember something about the commits returned by the API need to be filtered. Like, there is a kind of hierarchy and @GustavoVale needs only the top-level ones.
The grep from https://github.com/se-passau/GitHubWrapper/issues/6#issuecomment-304858726 prints all commits instead. Can you verify this, @GustavoVale?
Hi,
To be honest, I did not get what exactly that method should get because in some cases it got commits that are not part of the merge. I guess getMergeBase() is not getting the right base. For instance, in the project https://github.com/GustavoVale/Merge_Conflict_Scenarios almost always the merge base is the commit a40dbf9b9c5b65c32074d3b0d131dd811c619c5a.
I looked into it some more, and I get consistent behavior with the current version 3bffcb9 (and since I did not touch any related code since 05027d1 this still seems as the likely culprit for the original bug to me) .
For example, if I run
List<Optional<List<Commit>>> prcommits = repo.getPullRequests(State.ANY).get().stream().map(PullRequest::getCommits).collect(Collectors.toList())
on your example repo, I get
prcommits = {java.util.ArrayList@2149} size = 6
0 = {java.util.Optional@2153} "Optional[[1601adc1d57e24641b032311e39e2009ba09fde1]]"
1 = {java.util.Optional@2154} "Optional[[c03ac19b42f2d6054c1de53ee3b4c78fb60f7425]]"
2 = {java.util.Optional@2155} "Optional[[2511c7816fa183b44f6bf506e7bac936000f43cf]]"
3 = {java.util.Optional@2156} "Optional[[0b68a2c0719dbb4ea235adc1e52804d8cd051351]]"
4 = {java.util.Optional@2157} "Optional[[fec4a7878e4c33a90a0462c0662151b7eae38c5d]]"
5 = {java.util.Optional@2158} "Optional[[a40dbf9b9c5b65c32074d3b0d131dd811c619c5a, 46b7d342cc4c18d679cce16703064ee02acbfd0d]]"
which is, I believe after looking at the PR page for the repo, exactly what is supposed to be returned.
If this still does not work for you, I can support an additional method providing the data directly from GitHub. but I would prefer the local version.
I had made many tests in this commit you mentioned (05027d1aa9f8b79a5762e60eb8b1aa4f9ecd2c54) and it did not work, but now it is working on that project.
Just to be sure, could you test with cotonet project (https://github.com/riselabs-ufba/cotonet). I am getting a different result than the expected. The expected result is:
0 = {java.util.Optional@2153} "Optional[[1d8d85617ea0657b13f98719c6275c94e1f497ed, 468deabd7bb45b2dd56d22104e1980d1d04cc173, ca30c02586827ea3d5131ffb96e083dc49d7f1ca, e49696e6d784e8038d74ee97db3761f7f2e365ed, 3df81622365ddf36666400b0d2a7ce7723ef7095, 6df8b64fe5e81bdebe5cf06745a4787cb605d601, 8558b7079bbcb3ef5e6889b1a9febc3db95cd110, 13d75e2c6fb2df1bf8e4fc9da602ec6d2f59a418, 690402f32feaa410371f616ba83dc1f8372f9411]]" 1 = {java.util.Optional@2154} "Optional[[7aef670ff851cc6c6e5ce81272837ead3f2a221c, 419b2b7a66c741226538d6194d5d4b60520d15c0]]" 2 = {java.util.Optional@2155} "Optional[[40507ea0897d58917efa4dd7575f0a659a0143a0, e2ac51a5ff62855177125b7701af8859f1fa78a2]]" 3 = {java.util.Optional@2156} "Optional[[e69bf7d8dec6aa70fde02a23b209d3edb6ab1375, 833347733957783dc546a9ddf61548c42905b9d8, 1a7c4e2bffac9b6e368943463943104e0b1b5f8c, 57d309e3898238845861a8341bdfc8eca910e9a5, b593ebe5b142bd807164d1b49d130e49599ab1d0, 5fb00c797f9e1851e05b642b47406a43afe89170, adfc4177ce60ed38bdaa6234ecc0b550a2347610, 995656243931fce8ca46ade383063c79601c9357, 174dcba5998b53cf50a5891d487b3207c2669c92, 22f6ce834f41df2b396e69e100cb0391f70f0a86, 0b1be907f12f89ca450b7cf2fa98218abc92a316]]" 4 = {java.util.Optional@2157} "Optional[[7b12b635468852f77e63063a667d473127a7cad6, 1c683de937971810f997bfbdc870bf25a14f33b3, d5541784fe05756f32b572a70d132e20335dc123, 61f59c1a36aaec245951c0fd3093469018d7d040]]" 5 = {java.util.Optional@2158} "Optional[[ e69bf7d8dec6aa70fde02a23b209d3edb6ab1375, 170049a52192f7b7216894b5c2787e9aa06c9355, 4b314c4897a5e551aebcaa081c17321aa201e960, 5f1acc1c2cbf1514c02321fac6262114fa795eda, 6e6e2fa728b9a945ee6ac8505d8db28954993c35, 9b3e79343ce67547911427eac280d5ba5f48bbd3, d778f66adc3de9d3c57079ef83563f7aa3cd679b, 5e35611c3ed55a6d2e18fff56b29eada6e67e497]]"
The differences are in the position 3 and 4. The result I got miss commits in position 3 (I am getting just 1 commit) and has much more commits in position 4 (to be exactly 1026 commits, it should be just 4 commits). I mean position as the position in the array presented, but these positions refer to pull-requests 2 and 3.
@fheck Did you see my email?
Ok, @GustavoVale, your example did reveal another problem, which is - I think - the actual root of the bug. (Side note: I have implement the alternative data source for the commit list, which I will publish, once I have tested it fully.)
From a git graph perspective, the data provided by GitHub is wrong.
So I would like additional input, as - from my point of view - this bug may introduce latent faulty data. (@xai)
I'm using cotonet issue 3 as my example.
Looking at the data provided by GitHub, the head of the PR is 0b1be90
, the merge commit (merge_commit_sha
, which should hold the actual merge commits sha1 after the merge) is also at 0b1be90
, but should be at 148e43a
according to my local copy of the repo.
Now, basing my assumptions about anything relating to the internal git graph structure on this wrong merge, obviously leads to faulty data. In this case, the discrepancy can be mitigated, by only using the commit list provided by GitHub, but this might - even in this case (deleted branches come to mind and I have already seen other inconsistent behavior) - be not enough and will ultimately provide wrong data to the user of the library (PullRequest#getMerge()
).
I can try to reconstruct the correct merge commit id from looking at the graph, but this is at least partially guessing. On the other hand, I do not understand, why GitHub behaves this way, did I misunderstand something or is this a special case, which I have to account for?
Thanks for summarizing this situation in detail! This is ... interesting indeed. I'll look into this tomorrow. (Copying @GSeibt here as well)
@fheck, Thanks for solve the problem.
Apparently everything is working. I also made tests in other systems. I expected that the commits presented by GitHub could be different to the git tree. The only difference to the API to the GitHubWrapper will be with opened pull-requests, but such commits should not be found. One example is pull 7 (https://github.com/GustavoVale/Merge_Conflict_Scenarios/pull/7).
No, the data from GitHub should never contradict the local data, for all git related intents and purposes, GitHub is only a convenience (by providing an GUI/web interface) with a (separate but coupled) issue tracker. All branch/commit information is and must be reconstructable from only the local repository. That is why I want to use data from GitHub only when necessary and to provide additional information.
This method works for now, but for this PR PullRequest#getMerge()
will return wrong data!
That is why I will keep this open for now, as it is most likely not fixable on our end (or will require a workaround as detailed in my previous post).
Thanks for answer.
However, once there the pull-request is opened the commit is not local, right?. In this case we are talking about the wrong data is nothing, right?
Well, I try to pull from the fork so I have the commits locally (under a different branch/remote). If there is no fork repo (see #11) this will obviously fail and, in turn, even the newly implemented data source (the GitHub list) will fail in this case, as I need to verify the existence of those commits locally to not brake referential integrity with git and the behavior from GitWrapper and other internals.
But you are correct in that the GitHub bug described in my earlier post is not applicable to unmerged PRs, as for them PullRequest#getMerge()
returns an empty Optional
since there is no merge.
It is already solved. Commit bb57ed86b843e8e947e6f91530a98285a77b131b and previous solved this issue and #11
I want to keep this open, as its root is actually is not solved/solvable for me. I will rename it to a more appropriate title for the root cause, but I would like to keep it open for now to remind me (and other users) of this Issue.
I am trying to get the commits of pull-request 4 (https://github.com/GustavoVale/Merge_Conflict_Scenarios/pull/4/commits), but instead of get two commits (46b7d34 and a40dbf9) I am getting a list with thousands of commits (6817 to be precise, but most of them are repeated). I guess that it is happening because this is the first closed pull-request of the project and it is getting a wrong "base" commit. The correct base commit is (4076cc11574f99ab067470b02f0d6cda24a0cb30) the third commit of the returned list.