Closed ian-h-chamberlain closed 8 months ago
Thank you for the thorough writeup. I agree, this could be handled much more smoothly.
Respect dateSince in this particular query. Even if set from the CLI or config file, this query still uses null which can result in a large response. I'm not sure if this might impact user experience if some commits got omitted because of using dateSince but I think that's sort of expected
I was not able to reproduce this. I run this command, and I see the commits are being filtered as expected:
backport --author sqren --repo elastic/kibana --maxNumber 100 --date-since "10 Oct 2023"
Can you try re-running the above command and see if you get all commits or only commits after 10th of October.
Ah, thanks for the example, that helps me reproduce it more easily without using private repos. The filtering via the command you mentioned seems to work correctly, but that wasn't the original problem I ran into.
The issue seems to be occur when there is a merge conflict while backporting, so I was able to trigger by backporting a recent PR of yours to an older branch:
$ backport --dryRun --author sqren --repo elastic/kibana --maxNumber 100 --date-since "10 Oct 2023" --pr 167828
? Select pull request [APM] Add API test for multiple APM Servers (#167828)
? Select branch 5.6
✔ 100% Cloning repository from github.com (one-time operation)
Backporting to 5.6:
✔ Pulling latest changes
✖ Cherry-picking: [APM] Add API test for multiple APM Servers (#167828)
An unhandled error occurred while backporting commit. Please see the logs for details: /Users/ichamberlain/.backport/backport.info.log
Here's the last few lines of the full log, which I think includes the query that's requesting too much data:
{"level":"info","message":"POST https://api.github.com/graphql (name:CommitsByAuthor, status: 502, EXCEPTION THROWN)","metadata":{},"timestamp":"2023-10-16 13:33:12"}
{"level":"error","message":"runSequentially failed Request failed with status code 502 (Github API v4)","metadata":{"githubResponse":{"config":{"adapter":["xhr","http"],"data":"{\"query\":\"query CommitsByAuthor($authorId: ID, $commitPath: String, $dateSince: GitTimestamp, $dateUntil: GitTimestamp, $maxNumber: Int!, $repoName: String!, $repoOwner: String!, $sourceBranch: String!) {\\n repository(owner: $repoOwner, name: $repoName) {\\n ref(qualifiedName: $sourceBranch) {\\n target {\\n ... on Commit {\\n history(\\n first: $maxNumber\\n author: {id: $authorId}\\n path: $commitPath\\n since: $dateSince\\n until: $dateUntil\\n ) {\\n edges {\\n node {\\n ...SourceCommitWithTargetPullRequestFragment\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n}\\n\\nfragment SourceCommitWithTargetPullRequestFragment on Commit {\\n repository {\\n name\\n owner {\\n login\\n }\\n }\\n sha: oid\\n message\\n committedDate\\n author {\\n name\\n email\\n }\\n associatedPullRequests(first: 1) {\\n edges {\\n node {\\n title\\n url\\n number\\n labels(first: 50) {\\n nodes {\\n name\\n }\\n }\\n baseRefName\\n mergeCommit {\\n ...RemoteConfigHistoryFragment\\n sha: oid\\n message\\n }\\n timelineItems(last: 20, itemTypes: CROSS_REFERENCED_EVENT) {\\n edges {\\n node {\\n ... on CrossReferencedEvent {\\n targetPullRequest: source {\\n __typename\\n ... on PullRequest {\\n targetMergeCommit: mergeCommit {\\n sha: oid\\n message\\n }\\n repository {\\n name\\n owner {\\n login\\n }\\n }\\n url\\n title\\n state\\n baseRefName\\n number\\n commits(first: 20) {\\n edges {\\n node {\\n targetCommit: commit {\\n message\\n sha: oid\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n}\\n\\nfragment RemoteConfigHistoryFragment on Commit {\\n remoteConfigHistory: history(first: 1, path: \\\".backportrc.json\\\") {\\n edges {\\n remoteConfig: node {\\n committedDate\\n file(path: \\\".backportrc.json\\\") {\\n ... on TreeEntry {\\n object {\\n ... on Blob {\\n text\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n}\",\"variables\":{\"repoOwner\":\"elastic\",\"repoName\":\"kibana\",\"sourceBranch\":\"main\",\"maxNumber\":100,\"authorId\":null,\"commitPath\":null,\"dateSince\":null,\"dateUntil\":\"2023-10-13T08:24:23Z\"}}","env":{},"headers":{"Accept":"application/json, text/plain, */*","Accept-Encoding":"gzip, compress, deflate, br","Authorization":"bearer <REDACTED>","Content-Length":"3034","Content-Type":"application/json","User-Agent":"axios/1.5.1"},"maxBodyLength":-1,"maxContentLength":-1,"method":"post","timeout":0,"transformRequest":[null],"transformResponse":[null],"transitional":{"clarifyTimeoutError":false,"forcedJSONParsing":true,"silentJSONParsing":true},"url":"https://api.github.com/graphql","xsrfCookieName":"XSRF-TOKEN","xsrfHeaderName":"X-XSRF-TOKEN"},"data":{"data":null,"errors":[{"message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `C318:5862:1A4A0BC:35D9657:652D73C9` when reporting this issue."}]},"headers":{"access-control-allow-origin":"*","access-control-expose-headers":"ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset","content-type":"application/json","date":"Mon, 16 Oct 2023 17:33:12 GMT","server":"GitHub.com","transfer-encoding":"chunked","vary":"Accept-Encoding, Accept, X-Requested-With","x-github-request-id":"C318:5862:1A4A0BC:35D9657:652D73C9"},"status":502,"statusText":"Bad Gateway"},"name":"GithubV4Exception","stack":"GithubV4Exception: Request failed with status code 502 (Github API v4)"},"timestamp":"2023-10-16 13:33:12"}
{"level":"info","message":"Results","metadata":{"0":{"error":{"githubResponse":{"config":{"adapter":["xhr","http"],"data":"{\"query\":\"query CommitsByAuthor($authorId: ID, $commitPath: String, $dateSince: GitTimestamp, $dateUntil: GitTimestamp, $maxNumber: Int!, $repoName: String!, $repoOwner: String!, $sourceBranch: String!) {\\n repository(owner: $repoOwner, name: $repoName) {\\n ref(qualifiedName: $sourceBranch) {\\n target {\\n ... on Commit {\\n history(\\n first: $maxNumber\\n author: {id: $authorId}\\n path: $commitPath\\n since: $dateSince\\n until: $dateUntil\\n ) {\\n edges {\\n node {\\n ...SourceCommitWithTargetPullRequestFragment\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n}\\n\\nfragment SourceCommitWithTargetPullRequestFragment on Commit {\\n repository {\\n name\\n owner {\\n login\\n }\\n }\\n sha: oid\\n message\\n committedDate\\n author {\\n name\\n email\\n }\\n associatedPullRequests(first: 1) {\\n edges {\\n node {\\n title\\n url\\n number\\n labels(first: 50) {\\n nodes {\\n name\\n }\\n }\\n baseRefName\\n mergeCommit {\\n ...RemoteConfigHistoryFragment\\n sha: oid\\n message\\n }\\n timelineItems(last: 20, itemTypes: CROSS_REFERENCED_EVENT) {\\n edges {\\n node {\\n ... on CrossReferencedEvent {\\n targetPullRequest: source {\\n __typename\\n ... on PullRequest {\\n targetMergeCommit: mergeCommit {\\n sha: oid\\n message\\n }\\n repository {\\n name\\n owner {\\n login\\n }\\n }\\n url\\n title\\n state\\n baseRefName\\n number\\n commits(first: 20) {\\n edges {\\n node {\\n targetCommit: commit {\\n message\\n sha: oid\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n}\\n\\nfragment RemoteConfigHistoryFragment on Commit {\\n remoteConfigHistory: history(first: 1, path: \\\".backportrc.json\\\") {\\n edges {\\n remoteConfig: node {\\n committedDate\\n file(path: \\\".backportrc.json\\\") {\\n ... on TreeEntry {\\n object {\\n ... on Blob {\\n text\\n }\\n }\\n }\\n }\\n }\\n }\\n }\\n}\",\"variables\":{\"repoOwner\":\"elastic\",\"repoName\":\"kibana\",\"sourceBranch\":\"main\",\"maxNumber\":100,\"authorId\":null,\"commitPath\":null,\"dateSince\":null,\"dateUntil\":\"2023-10-13T08:24:23Z\"}}","env":{},"headers":{"Accept":"application/json, text/plain, */*","Accept-Encoding":"gzip, compress, deflate, br","Authorization":"bearer <REDACTED>","Content-Length":"3034","Content-Type":"application/json","User-Agent":"axios/1.5.1"},"maxBodyLength":-1,"maxContentLength":-1,"method":"post","timeout":0,"transformRequest":[null],"transformResponse":[null],"transitional":{"clarifyTimeoutError":false,"forcedJSONParsing":true,"silentJSONParsing":true},"url":"https://api.github.com/graphql","xsrfCookieName":"XSRF-TOKEN","xsrfHeaderName":"X-XSRF-TOKEN"},"data":{"data":null,"errors":[{"message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `C318:5862:1A4A0BC:35D9657:652D73C9` when reporting this issue."}]},"headers":{"access-control-allow-origin":"*","access-control-expose-headers":"ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset","content-type":"application/json","date":"Mon, 16 Oct 2023 17:33:12 GMT","server":"GitHub.com","transfer-encoding":"chunked","vary":"Accept-Encoding, Accept, X-Requested-With","x-github-request-id":"C318:5862:1A4A0BC:35D9657:652D73C9"},"status":502,"statusText":"Bad Gateway"},"name":"GithubV4Exception"},"status":"unhandled-error","targetBranch":"5.6"}},"timestamp":"2023-10-16 13:33:12"}
Setting --maxNumber 20
seems to reduce the size of the query enough for it to work, but the precise number probably depends on the server load, size of the query contents, etc.
Hopefully that helps with the details — if you need more info, let me know and I can provide full logs from a different computer (assuming I can reproduce there as well). Thanks!
I've published some improvements in 9.4.1. Would be great if you can give it a try, thanks!
Thanks for the quick fix! It looks like your changes helped, I can no longer reproduce with the original command that I first ran into problems on.
You are welcome. FYI: I added this:
When encountering a conflict the tool automatically retrieves related commits to figure out if one of them caused the conflict. In this case it was using the same maxCommit
value as specified via the CLI. I've changed it to a hardcoded value of 50
I was trying this tool out for the first time and I seem to have found a case that can pretty easily trigger a timeout from Github's servers when querying for a conflict. Unfortunately it's on a private repo so I can't share all the details, but I've managed to track down what I think is the cause. Here is the problematic query, from the logs, which I think is only used when there is a merge conflict:
and the response:
From reading a few sources online, it seems like this is mostly caused by the response taking too long (> 10s) and Github servers timing out. There might be a few things that could help address this:
dateSince
in this particular query. Even if set from the CLI or config file, this query still usesnull
which can result in a large response. I'm not sure if this might impact user experience if some commits got omitted because of usingdateSince
but I think that's sort of expectedmaxNumber
or add date filters? It took me a bit to figure out the root cause of the problem at all