unfoldingWord / dcs-branch-merger

A package to merge branches with the DCS API.
https://unfoldingword.github.io/dcs-branch-merger/
MIT License
0 stars 0 forks source link

Add optional filename argument to the check functions #5

Open mandolyte opened 1 year ago

mandolyte commented 1 year ago

If filename is present then here are some notes on a solution.

Swagger notes: This API: https://qa.door43.org/api/swagger#/repository/repoGetPullRequestFiles

used like this: https://qa.door43.org/api/v1/repos/unfoldingword/en_tn/pulls/3189/files

returns this:

[
  {
    "filename": "tn_TIT.tsv",
    "status": "changed",
    "additions": 7,
    "deletions": 13,
    "changes": 20,
    "html_url": "https://qa.door43.org/unfoldingWord/en_tn/src/commit/82ba83a294f96a5380e55ba828546304cff5486b/tn_TIT.tsv",
    "contents_url": "https://qa.door43.org/api/v1/repos/unfoldingWord/en_tn/contents/tn_TIT.tsv?ref=82ba83a294f96a5380e55ba828546304cff5486b",
    "raw_url": "https://qa.door43.org/unfoldingWord/en_tn/raw/commit/82ba83a294f96a5380e55ba828546304cff5486b/tn_TIT.tsv"
  }
]

It shows:

mandolyte commented 1 year ago

If the filename provided matches one in the returned array, then continue as if this were the only file in the branch to consider.

NOTE: this may only be useful for the check for updates from master.

mandolyte commented 1 year ago

@richmahn I think I have found one problem, namely:

The UI has the "update branch by merge" button. But there doesn't seem to be anyway to know what it will update in my branch if I click it.

I suppose this makes sense, but it blocks me from achieving the purpose of this issue, which is to inform the user about a specific file, the one they are working on. Especially for branches named by bookid, whether any other books in their branch are older is not of interest.

richmahn commented 1 year ago

@mandolyte I assume we'll have to then make a PR of master into user to get that info. Only way I see.

mandolyte commented 1 year ago

We have an approach that avoids the need for second PR to be created. See discussion here.

Will document more in a later post here.

mandolyte commented 1 year ago

First, take note of this work by @richmahn:

Compare file shas: https://replit.com/@richmahn/Finds-Base-SHA-and-If-File-Changed-in-Master#index.js

Second, here is an outline of the idea:

Thinking...

  • suppose we look at the base SHA and the merge base SHA
  • if they are different, then that means master has had commits since the branch was started
  • then if get the SHA for a given file from base and merge base and they are different, then that means the users file has had commits since branch started

Could this approach be used to avoid making a second PR? (unfortunately I don't see a way to get the SHA for a single file in swagger)

Third, the JSON returned by the PR is helpful. Quoting from the discord chat:

The PR we create today has the SHA for base, merge_base, and head. Do we need any more than that? (sorry, it is in the JSON we get back)

So this is the essential idea. My next post will be to work thru this by hand using the Gitea API

mandolyte commented 1 year ago

Case 1. branch-behind

NOTE! if the check function says there are conflicts, do not continue. Conflicts must be handled manually.

In this case, the branch is behind master and can be updated without conflict.

Here is the base info:

base: 
    label:"master"
    ref:"master"
    repo:{...}
    repo_id:72626
    sha:"2d137737c84b93c4336d1de2937f89269fba9e93"

The sha value being the thing of interest.

Here is the head info:

head: 
    label:"branch-behind"
    ref:"branch-behind"
    repo:{...}
    repo_id:72626
    sha:"3a02625ab1a7c9dd821eb5692805961ecae7ccab"

Here is the merge_base info:

merge_base:"3a02625ab1a7c9dd821eb5692805961ecae7ccab"

In this case, merge_base is not equal to base: This means that master has been updated since the user branch was created. This is the key indicator that the file being worked on might have updates that can be applied to the user branch copy of the file. And since there are no conflicts those updates can be safely incorporated.

So let's get the file sha values from current master and merge base and compare them.

To do this we use the "contents" API. https://qa.door43.org/api/swagger#/repository/repoGetContents and we will use tn_TIT.tsv as the file to test

First the merge base (which will be the file as it was when user branch (head) was created, with sha 3a02625ab1a7c9dd821eb5692805961ecae7ccab.

{
  "name": "tn_TIT.tsv",
  "path": "tn_TIT.tsv",
  "sha": "cc6fa82bc04d50ec9c38258976a58011db764403",
  "type": "file",
  "size": 43175,
}

Second the current master:

{
  "name": "tn_TIT.tsv",
  "path": "tn_TIT.tsv",
  "sha": "78c58a858c8d2d7e01ce65d04e210d2e21da39e4",
  "type": "file",
  "size": 43176,
}

Observe that the file has changed:

Therefore, the copy in user's branch can be updated and the updates will not cause a conflict.

mandolyte commented 1 year ago

Case 2: branch-behind-and-ahead

NOTE! if the check function says there are conflicts, do not continue. Conflicts must be handled manually.

In this case:

The URL for base contents: https://git.door43.org/api/v1/repos/dcs-poc-org/en_tn/contents/tn_TIT.tsv?ref=2d137737c84b93c4336d1de2937f89269fba9e93

returns:

{
  "name": "tn_TIT.tsv",
  "path": "tn_TIT.tsv",
  "sha": "78c58a858c8d2d7e01ce65d04e210d2e21da39e4",
  "type": "file"
}

The URL for merge base contents: https://git.door43.org/api/v1/repos/dcs-poc-org/en_tn/contents/tn_TIT.tsv?ref=3a02625ab1a7c9dd821eb5692805961ecae7ccab

returns:

{
  "name": "tn_TIT.tsv",
  "path": "tn_TIT.tsv",
  "sha": "cc6fa82bc04d50ec9c38258976a58011db764403",
  "type": "file"
}

Therefore, like Case 1, the copy in the user's branch can be updated.

In this same branch, consider Zephaniah instead of Titus.

{
  "name": "tn_ZEP.tsv",
  "path": "tn_ZEP.tsv",
  "sha": "c2cd8b9932014a6780714e1e39729af84df5f84e",
  "type": "file",
} 

vs.

{
  "name": "tn_ZEP.tsv",
  "path": "tn_ZEP.tsv",
  "sha": "c2cd8b9932014a6780714e1e39729af84df5f84e",
  "type": "file",
}

The file has not changed. So if Zephaniah had been the book of interest, we would not need to update it.

richmahn commented 1 year ago

@mandolyte Unless I'm missing something, for Case 2, you state "the copy in user's branch can be updated without conflict.", you don't know this. It could be that the update master has for the file is on lines that the user has edited, as this branch is also "ahead".

For branch-behind, you do know it can be updated without conflict as the user branch commit sha is still the same same as the merge_base commit sha.

mandolyte commented 1 year ago

@richmahn that's true... I'll change the wording. I'm guessing that if the conflict property returned is true, they shouldn't go any further and check on file updates.

Is there a way to tell which files have conflicts from the PR?

richmahn commented 1 year ago

@mandolyte We just happened to be talking about this feature today, and I came here to find I guess it wasn't implemented as it is still open, and now saw I never saw your comment to me.