ptsochantaris / trailer

Managing Pull Requests and Issues For GitHub & GitHub Enterprise
https://github.com/ptsochantaris/trailer

[1.8.x] Issues with Github background sync #476

Closed Jecoms closed 1 year ago

Jecoms commented 1 year ago

On upgrade to 1.8/1.8.1, I reset/reloaded all data as the syncing seemed to be broken.

Initially, this seemed to update as expected and the menu bar icons have correct styling/count.

A few minutes later, the GitHub server label goes red and all menu bar icons switch to red Xs.

[Screenshots: 2023-07-06 at 10:37:08 AM and 2023-07-06 at 10:43:22 AM]

katiawheeler commented 1 year ago

I am also having this issue since the upgrade. When I downgrade back to version 1.7.6 it's no longer an issue.

ptsochantaris commented 1 year ago

Hi, and thanks for reporting! This sounds to me like GH isn't liking the query style of this version on accounts with many items - can you try this build to see if it improves things for you?

Also, if possible, could you turn on logging from Misc and see if you can spot any issues via Console.app? If this is not a throttling issue there may be some other problem that the log could provide us with more info about.

[edit: removed, see below]

Thanks! Will be keeping an eye on this thread and try to fix this issue asap once we know more.

ptsochantaris commented 1 year ago

Whoops, scratch that, that build was totally broken, apologies - please give this a try instead:

[Edit: removed - version now updated]

Andreas409 commented 1 year ago

I think mine is working again on this latest build 👍

ptsochantaris commented 1 year ago

That's great, thanks for the feedback, if things still seem fine by this evening I'll do an update.

ptsochantaris commented 1 year ago

I have put up an update with this tweak now. I shortly expect to add an option to allow parallel v4 API requests (default off) so that, for users like me where this doesn't cause GH to throttle, we can turn it on. Sorry for the hiccup, I'll leave this open for a little while in case there are further issues.

Jecoms commented 1 year ago

After upgrade I am seeing the same behavior. I turned on the logging, but I'm not seeing any specific github/gql errors related to the data sync other than a log saying it failed. Logs included at the bottom.

I do have a lot of repos filtered by participation, so I went ahead and hid a bunch of them to be under 20 total (I started with 40-50). It seems related to the gql scope and maybe it's an n+1 query situation to get PR related data like assignees (There was a large block of logs saying "needs 1 more query").

This reduced set of repos works for my needs, so I'll stay on the latest version. I'm happy to retry with the larger set again with any new tweaks.

Appreciate the renewed time and attention to this project!

log examples:

default 10:00:14.462882-0500    Trailer Will pause and retry call to https://api.github.com/graphql
default 10:00:34.797739-0500    Trailer Failed call to https://api.github.com/graphql
default 10:00:34.804768-0500    Trailer Status update: Processing update…
default 10:00:34.805377-0500    Trailer Rolling back changes for failed sync on API server 'Github'
default 10:00:34.808521-0500    Trailer Nuked total 0 items marked for deletion
default 10:00:34.809954-0500    Trailer Committing synced data
default 10:00:34.810178-0500    Trailer Synced data committed
default 10:00:34.810239-0500    Trailer No DB changes
default 10:00:34.810874-0500    Trailer Postprocess done - 0.0006099939346313477 sec
default 10:00:34.813730-0500    Trailer No DB changes
default 10:00:34.813801-0500    Trailer Refresh done
default 10:00:34.814079-0500    Trailer Status update: Last update failed
default 10:00:34.876756-0500    Trailer order window: ab2a op: 0 relative: 0 related: 0
default 10:00:34.876954-0500    Trailer order window: ab2b op: 0 relative: 0 related: 0
default 10:00:34.877096-0500    Trailer order window: ab2c op: 0 relative: 0 related: 0
default 10:00:34.899146-0500    Trailer Updating general PullRequest menu, X total items

amayers commented 1 year ago

I only have 5 watched repositories. But I'm seeing the same issue & logs as @Jecoms is. Running 1.8.2 (1652).

ptsochantaris commented 1 year ago

Thanks for the update @Jecoms and @amayers - the Will pause and retry error does seem to imply a rate issue.

"1 more query" is fine, as basically it means that after processing some data, that data is signalling that more paging is needed.

If you turn on Dump API responses to the console in Misc you may be able to track which request is failing and what the error coming from the server is, it may provide us with more info.
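For anyone reading the "needs 1 more query" log lines, here is a minimal sketch of the cursor paging they describe. It is not Trailer's code: `fetch_all` and `make_fake_server` are hypothetical names, and the page shape (`edges`, `cursor`, `pageInfo.hasNextPage`) is modelled on the fields that appear in the logged queries.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Assumed connection shape, modelled on the logged queries:
# {"edges": [{"node": {...}, "cursor": "..."}], "pageInfo": {"hasNextPage": bool}}
Connection = Dict


def fetch_all(fetch_page: Callable[[Optional[str]], Connection]) -> Iterator[dict]:
    """Yield every node, asking for one more page while the server reports one."""
    cursor: Optional[str] = None
    while True:
        connection = fetch_page(cursor)
        edges = connection["edges"]
        for edge in edges:
            yield edge["node"]
        if not connection["pageInfo"]["hasNextPage"] or not edges:
            return
        # The processed data signals that more paging is needed.
        cursor = edges[-1]["cursor"]


def make_fake_server(items: List[dict], page_size: int) -> Callable[[Optional[str]], Connection]:
    """In-memory stand-in for the API so the sketch runs without network access."""
    def fetch_page(cursor: Optional[str]) -> Connection:
        start = int(cursor) + 1 if cursor is not None else 0
        end = min(start + page_size, len(items))
        edges = [{"node": items[i], "cursor": str(i)} for i in range(start, end)]
        return {"edges": edges, "pageInfo": {"hasNextPage": end < len(items)}}
    return fetch_page


if __name__ == "__main__":
    server = make_fake_server([{"id": i} for i in range(7)], page_size=3)
    print(len(list(fetch_all(server))))
```

With seven items and a page size of three, the loop issues three queries in total, which is the benign case the log message refers to.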

amayers commented 1 year ago

@ptsochantaris I don't see any error messages from Github. In the servers tab it shows that my API limit is not even showing up on the bar. I don't have any other apps that are using the API, so I shouldn't be hitting the rate limit.

default 12:05:48.798306-0400    Trailer Will sync items from: <redacted>, <redacted>, <redacted>, <redacted>, <redacted>
default 12:05:48.801183-0400    Trailer order window: 23d5 op: 0 relative: 0 related: 0
default 12:05:48.805593-0400    Trailer Updating general PullRequest menu, X total items
default 12:05:48.807607-0400    Trailer order window: 23ea op: 1 relative: 23ea related: 0
default 12:05:48.813618-0400    Trailer (TQL 'GitHub: Open PRs') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment repositoryFragment on Repository { __typename id pullRequests(first: 50, states: [OPEN]) { edges { node { __typename ... pullrequestFragment } cursor } pageInfo { hasNextPage } } } fragment labelFragment on Label { __typename id name color createdAt updatedAt } fragment pullrequestFragment on PullRequest { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } headRefOid mergeable additions deletions headRefName baseRefName isDraft mergedBy { __typename ... userFragment } baseRepository { __typename nameWithOwner } headRepository { __typename nameWithOwner } } fragment userFragment on<…>
default 12:05:48.813788-0400    Trailer (TQL 'GitHub: Open Issues') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment botFragment on Bot { __typename id login avatarUrl } fragment issueFragment on Issue { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } } fragment labelFragment on Label { __typename id name color createdAt updatedAt } fragment repositoryFragment on Repository { __typename id issues(first: 50, states: [OPEN]) { edges { node { __typename ... issueFragment } cursor } pageInfo { hasNextPage } } } fragment userFragment on User { __typename id login avatarUrl } { nodes(ids: ["MDEwOlJlcG9zaXRvcnkzMTg4NDE1ODQ=","MDEwOlJlcG9zaXRvcnkzMjAxNjQ1NTk=","MDEwOlJlcG9zaXRvcnkzMDg3MDM1Mzk=","M<…>
default 12:05:48.813936-0400    Trailer (TQL 'Authored Items') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment botFragment on Bot { __typename id login avatarUrl } fragment pullrequestFragment on PullRequest { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } headRefOid mergeable additions deletions headRefName baseRefName isDraft mergedBy { __typename ... userFragment } baseRepository { __typename nameWithOwner } headRepository { __typename nameWithOwner } repository { __typename ... repositoryFragment } } fragment labelFragment on Label { __typename id name color createdAt updatedAt } fragment repositoryFragment on Repository { __typename id createdAt updatedAt isFork isArchived nameWithO<…>
default 12:05:48.814378-0400    Trailer (TQL 'Authored Items') Fetching: fragment milestoneFragment on Milestone { __typename title } fragment issueFragment on Issue { __typename id bodyText state createdAt updatedAt number title url milestone { __typename ... milestoneFragment } author { __typename ... userFragment ... botFragment } assignees(first: 20) { edges { node { __typename ... userFragment } cursor } pageInfo { hasNextPage } } labels(first: 20) { edges { node { __typename ... labelFragment } cursor } pageInfo { hasNextPage } } repository { __typename ... repositoryFragment } } fragment botFragment on Bot { __typename id login avatarUrl } fragment repositoryFragment on Repository { __typename id createdAt updatedAt isFork isArchived nameWithOwner url isPrivate owner { __typename id } } fragment userFragment on User { __typename id login avatarUrl } fragment labelFragment on Label { __typename id name color createdAt updatedAt } { viewer { __typename issues(first: 100, states: [OPEN]) { edges { node { __typename ... issueFragment } <…>
default 12:05:48.814580-0400    Trailer Status update: GitHub: Open PRs
default 12:05:50.074615-0400    Trailer order window: 22b7 op: 0 relative: 0 related: 0
default 12:05:59.932531-0400    Trailer Will pause and retry call to https://api.github.com/graphql
default 12:06:15.706038-0400    Trailer Will pause and retry call to https://api.github.com/graphql
default 12:06:22.025238-0400    Trailer Setting LAST_PREFS_TAB_SELECTED_OSX to 0
default 12:06:26.627696-0400    Trailer Setting LAST_PREFS_TAB_SELECTED_OSX to 11
default 12:06:29.453216-0400    Trailer Setting LAST_PREFS_TAB_SELECTED_OSX to 0
default 12:06:31.940960-0400    Trailer Will pause and retry call to https://api.github.com/graphql
default 12:06:32.358858-0400    Trailer order window front conditionally: 22c1 related: 0
default 12:06:35.945935-0400    Trailer order window: 22c1 op: 0 relative: 0 related: 0
default 12:06:49.423100-0400    Trailer Will pause and retry call to https://api.github.com/graphql
default 12:06:59.815551-0400    Trailer Setting LAST_PREFS_TAB_SELECTED_OSX to 1
default 12:07:01.676331-0400    Trailer Setting NEW_REPO_CHECK_PERIOD to 2.0
default 12:07:01.692583-0400    Trailer order window front conditionally: 23f0 related: 0
default 12:07:04.500174-0400    Trailer order window front conditionally: 22c1 related: 0
default 12:07:05.236879-0400    Trailer Failed call to https://api.github.com/graphql
default 12:07:05.237083-0400    Trailer Status update: GitHub: Open Issues
default 12:07:06.696990-0400    Trailer order window: 22c1 op: 0 relative: 0 related: 0
default 12:07:06.825124-0400    Trailer API data from https://api.github.com/graphql: ByteBuffer { readerIndex: 0, writerIndex: 46324, readableBytes: 46324, capacity: 65536, storageCapacity: 65536, slice: _ByteBufferSlice { 0..<65536 }, storage: 0x0000000158008000 (65536 bytes) }
default 12:07:06.830311-0400    Trailer (TQL 'GitHub: Open Issues') Received page (Cost: 1, Remaining: 4659/5000 - Expected Count: 10255 - Returned Count: 10255)
default 12:07:06.830383-0400    Trailer (TQL 'GitHub: Open Issues') Scanning result
default 12:07:06.830545-0400    Trailer (TQL 'GitHub: Open Issues') Scanning result
default 12:07:06.830636-0400    Trailer Status update: Authored Items
default 12:07:06.831142-0400    Trailer (TQL 'GitHub: Open Issues') Parsed all pages
default 12:07:06.831259-0400    Trailer (TQL 'GitHub: Open Issues') Parsed all pages
default 12:07:06.831656-0400    Trailer Processing GQL nodes: Label: 20, Issue: 7, User: 9, Repository: 5
default 12:07:06.834255-0400    Trailer Creating Issue ID: I_kwDOExVSz85UrDoI (v4)
default 12:07:06.834950-0400    Trailer Creating Issue ID: I_kwDOEmZxM848sUfr (v4)
default 12:07:06.835493-0400    Trailer Creating Issue ID: I_kwDOEmZxM849uBxS (v4)
default 12:07:06.835660-0400    Trailer Creating Issue ID: I_kwDOEmZxM85hOq3X (v4)
default 12:07:06.835819-0400    Trailer Creating Issue ID: I_kwDOEn8W285RZilp (v4)
default 12:07:06.836303-0400    Trailer Creating Issue ID: I_kwDOEn8W285hO4fC (v4)
default 12:07:06.836472-0400    Trailer Creating Issue ID: I_kwDOGVBfCM5RUV7s (v4)
default 12:07:06.837371-0400    Trailer Creating PRLabel ID: LA_kwDOExVSz88AAAABGPqFBA (v4)
default 12:07:06.837584-0400    Trailer Creating PRLabel ID: LA_kwDOExVSz88AAAABGPqFCA (v4)
default 12:07:06.837735-0400    Trailer Creating PRLabel ID: LA_kwDOExVSz88AAAABGPqFDA (v4)
default 12:07:06.837879-0400    Trailer Creating PRLabel ID: LA_kwDOExVSz88AAAABGQfioQ (v4)
default 12:07:06.838022-0400    Trailer Creating PRLabel ID: LA_kwDOEmZxM88AAAABDeXTzA (v4)
default 12:07:06.838160-0400    Trailer Creating PRLabel ID: LA_kwDOEmZxM88AAAABDeXT1g (v4)
default 12:07:06.838295-0400    Trailer Creating PRLabel ID: LA_kwDOEmZxM88AAAABDeXT1w (v4)
default 12:07:06.838433-0400    Trailer Creating PRLabel ID: LA_kwDOEmZxM88AAAABDejd5w (v4)
default 12:07:06.838565-0400    Trailer Creating PRLabel ID: LA_kwDOEn8W288AAAABDNPnGA (v4)
default 12:07:06.838700-0400    Trailer Creating PRLabel ID: LA_kwDOEn8W288AAAABDNPnGQ (v4)
default 12:07:06.838831-0400    Trailer Creating PRLabel ID: LA_kwDOEn8W288AAAABDNPnGg (v4)
default 12:07:06.838973-0400    Trailer Creating PRLabel ID: LA_kwDOEn8W288AAAABDNd_4A (v4)
default 12:07:06.839294-0400    Trailer Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJlTSA (v4)
default 12:07:06.839445-0400    Trailer Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJlTSw (v4)
default 12:07:06.839586-0400    Trailer Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJlTVA (v4)
default 12:07:06.839711-0400    Trailer Creating PRLabel ID: LA_kwDOGVBfCM8AAAABDJ27gQ (v4)
default 12:07:07.779143-0400    Trailer API data from https://api.github.com/graphql: ByteBuffer { readerIndex: 0, writerIndex: 5870, readableBytes: 5870, capacity: 16384, storageCapacity: 16384, slice: _ByteBufferSlice { 0..<16384 }, storage: 0x000000012e8c6600 (16384 bytes) }
default 12:07:07.780575-0400    Trailer (TQL 'Authored Items') Received page (Cost: 2, Remaining: 4657/5000 - Expected Count: 4100 - Returned Count: 4100)
default 12:07:07.780714-0400    Trailer (TQL 'Authored Items') Scanning result
default 12:07:07.780988-0400    Trailer (TQL 'Authored Items') Scanning result
default 12:07:07.781035-0400    Trailer Status update: Authored Items
default 12:07:07.781402-0400    Trailer (TQL 'Authored Items') Parsed all pages
default 12:07:07.781591-0400    Trailer (TQL 'Authored Items') Parsed all pages
default 12:07:07.782053-0400    Trailer Processing GQL nodes: User: 6, Label: 1, PullRequest: 3, Repository: 3, Organization: 3
default 12:07:07.786207-0400    Trailer Creating PullRequest ID: PR_kwDOEwEi8M5VGI4G (v4)
default 12:07:07.786972-0400    Trailer Creating PullRequest ID: PR_kwDOEwEi8M5VHYnd (v4)
default 12:07:07.787200-0400    Trailer Creating PullRequest ID: PR_kwDOEwEi8M5VM_CH (v4)
default 12:07:07.787904-0400    Trailer Creating PRLabel ID: LA_kwDOEwEi8M8AAAABRb5t-Q (v4)
default 12:07:08.246576-0400    Trailer API data from https://api.github.com/graphql: ByteBuffer { readerIndex: 0, writerIndex: 199, readableBytes: 199, capacity: 16384, storageCapacity: 16384, slice: _ByteBufferSlice { 0..<16384 }, storage: 0x000000012f033000 (16384 bytes) }
default 12:07:08.246987-0400    Trailer (TQL 'Authored Items') Received page (Cost: 2, Remaining: 4655/5000 - Expected Count: 4100 - Returned Count: 4100)
default 12:07:08.247056-0400    Trailer (TQL 'Authored Items') Scanning result
default 12:07:08.247189-0400    Trailer (TQL 'Authored Items') Scanning result
default 12:07:08.247359-0400    Trailer (TQL 'Authored Items') Parsed all pages
default 12:07:08.247442-0400    Trailer (TQL 'Authored Items') Parsed all pages
default 12:07:08.247692-0400    Trailer Processing GQL nodes:
default 12:07:08.249692-0400    Trailer Status update: Processing 33 items…
default 12:07:08.249995-0400    Trailer Rolling back changes for failed sync on API server 'GitHub'
default 12:07:08.251969-0400    Trailer Nuked total 0 items marked for deletion
default 12:07:08.252432-0400    Trailer Committing synced data
default 12:07:08.252526-0400    Trailer Synced data committed
default 12:07:08.252558-0400    Trailer Saving DB
default 12:07:08.254170-0400    Trailer Postprocess done - 0.0013219118118286133 sec
default 12:07:08.255312-0400    Trailer No DB changes
default 12:07:08.255336-0400    Trailer Refresh done
default 12:07:08.255429-0400    Trailer Status update: Last update failed
default 12:07:08.269026-0400    Trailer order window: 23ea op: 0 relative: 0 related: 0
default 12:07:08.271878-0400    Trailer Updating general PullRequest menu, X total items

ptsochantaris commented 1 year ago

Indeed, definitely doesn't look like an API throttle issue @amayers - this looks way more like a GraphQL query problem. Two of the queries are failing there. It's odd that your log isn't showing the API error coming from the server. If you turn off Authored Items sync does the issue go away?

Also, are any of the repos you're following public? I'd love to try and reproduce the issue locally.

amayers commented 1 year ago

@ptsochantaris I tried turning off the authored items sync, but that didn't fix it. I also just tried downgrading to v1.7.6, and the console logs have more details for the API responses. No, none of these repos are public. Also of possible note, I'm using a fine grained personal access token (my organization now requires it). However Trailer did work with this token as of a week or two ago.

default 12:37:03.278708-0400    Trailer Status update: GitHub: Open PRs
default 12:37:03.279405-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> resuming, timeouts(60.0, 604800.0) QOS(0x9) Voucher (null)
default 12:37:03.281167-0400    Trailer [Telemetry]: Activity <nw_activity 12:2[A7B09696-3341-4ABD-B614-549130F22252] (reporting strategy default)> on Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> was not selected for reporting
default 12:37:03.281853-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> {strength 0, tls 4, sub 0, sig 1, ciphers 0, bundle 0, builtin 0}
default 12:37:03.282088-0400    Trailer [C2] event: client:connection_reused @79.669s
default 12:37:03.283298-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> now using Connection 2
default 12:37:03.283874-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> sent request, body S 1276
default 12:37:06.721988-0400    Trailer [C2] event: client:data_stall @83.109s
default 12:37:13.751696-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> received response, status 502 content U
default 12:37:13.752280-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> done using Connection 2
default 12:37:13.752438-0400    Trailer [C2] event: client:connection_idle @90.139s
default 12:37:13.752876-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> response ended
default 12:37:13.753189-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> summary for task success {transaction_duration_ms=10471, response_status=502, connection=2, reused=1, request_start_ms=1, request_duration_ms=0, response_start_ms=10470, response_duration_ms=1, request_bytes=1398, response_bytes=707, cache_hit=true}
default 12:37:13.755007-0400    Trailer Task <2C33EA7E-6653-4770-B23A-E6E084A5626A>.<17> finished successfully
default 12:37:13.755760-0400    Trailer API data from https://api.github.com/graphql: {
   "data": null,
   "errors":[
      {
         "message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `CF2E:728F:1A97DE8:360CCCE:64AD8520` when reporting this issue."
      }
   ]
}
default 12:37:13.756566-0400    Trailer (GQL 'GitHub: Open PRs') Received page (No stats)
default 12:37:13.757046-0400    Trailer (GQL 'GitHub: Open PRs')  Error: Failed with error: 'Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `CF2E:728F:1A97DE8:360CCCE:64AD8520` when reporting this issue.'
default 12:37:13.757168-0400    Trailer (GQL 'GitHub: Open PRs')  Pausing for retry, attempt 4
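
The response body logged above has `"data": null` alongside an `errors` array. A minimal sketch of pulling the messages out of such a body follows. The function name `split_graphql_response` is hypothetical, and the sample string is an abbreviated form of the logged body with the request ID left out.

```python
import json
from typing import Any, List, Optional, Tuple


def split_graphql_response(body: str) -> Tuple[Optional[Any], List[str]]:
    """Return (data, error_messages) from a GraphQL response body."""
    payload = json.loads(body)
    errors = payload.get("errors") or []
    messages = [err.get("message", "<no message>") for err in errors]
    return payload.get("data"), messages


if __name__ == "__main__":
    # Abbreviated form of the body logged above; the request ID is omitted.
    sample = (
        '{"data": null, "errors": '
        '[{"message": "Something went wrong while executing your query."}]}'
    )
    data, messages = split_graphql_response(sample)
    if data is None or messages:
        print("sync failed:", "; ".join(messages) or "no data returned")
```

Treating either a null `data` or a non-empty `errors` list as a failure is a design choice of this sketch, not something the thread confirms about Trailer.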

ptsochantaris commented 1 year ago

Oh I know why you can't see error messages in the log - it's a clear bug there, this build will output errors correctly, so maybe that can give us some insight. BTW Thanks for your patience everyone, we'll get it sorted.

Trailer.app.zip

Reminder to self: Implement a user-visible sync log already!!!

ptsochantaris commented 1 year ago

@amayers The error that you see there is a clear throttling error from GitHub. Even though Trailer does query things inside the boundaries of the API limits, it seems that some queries just "overflow" internally. That's why Trailer has a retry mechanism (you can see the message for that at the end of your log). One of the best ways to calm GH is to just not run Trailer for a few minutes. I'm going to try heavy-handedly reducing query sizes in a build and seeing if that improves things.

(BTW the fine grained token you mention shouldn't make a difference)
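
As an illustration of the pause-and-retry idea mentioned above, here is a minimal sketch. It is not Trailer's implementation: `TransientError`, `call_with_retry`, the attempt count, and the doubling delay are all assumptions, since the thread only shows that a retry happens ("Pausing for retry, attempt 4").

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 502s, throttling)."""


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, pausing with a doubling delay between failed attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == attempts:
                raise  # Out of attempts: surface the failure to the caller.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the pause schedule can be checked without actually waiting.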

amayers commented 1 year ago

Using that build I do see more response details. But I'm not really seeing any additional details on the error itself. I'll try dialing back some of Trailer's settings so hopefully it makes fewer requests, with less in each.

ptsochantaris commented 1 year ago

On a weirdly good side, this doesn't look like an issue with the rate per se, more like the specific query, so I can at least put back the multithreaded querying. But of course that leaves us with the mystery about why this query is failing. This build here cuts the PR batch by half for the Open PRs query (from 50 to 25 per page) - let's see if it helps.

Trailer.app.zip

ptsochantaris commented 1 year ago

(BTW If anyone wants to forward any queries or API results which may help but don't want to make them public, you can always reach me at my email which is my GH handle at me.com)

amayers commented 1 year ago

That build doesn't fix my issue (app still shows X). But I do see a lot more successful responses in the logs. I filed a ticket with GitHub quoting the error: "Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `CF2E:728F:1A97DE8:360CCCE:64AD8520` when reporting this issue." Hopefully they can identify what part of the request is causing the issue.

ptsochantaris commented 1 year ago

Nice, thanks for trying that. I'm just cooking up a build in which you can configure the page sizes manually so perhaps you can experiment. Will post it here shortly.

ptsochantaris commented 1 year ago

@amayers This build has a ... button next to the v4 API checkbox in preferences. From there you can change the page size of queries. The defaults are quite conservative, and I've made some other bits of those queries lighter as well, but I'd be interested to see what kind of results you get. Thanks so much for helping with this!

[Edit: Have put this up as an update to play it safe, but please do give it a try when you find the time to see if it helps you]

amayers commented 1 year ago

@ptsochantaris Around 5 PRs/page it started to work with an occasional timeout/retry. So I lowered it to 4 for now and that seems good. I then turned back on most of the other details in the requests (reactions, merge conflict, line counts) and it seems to be working with all of that. The issues/page doesn't matter as our org doesn't use issues so they are disabled on all these repos.

Thank you so much for this work! I don't see any way to send you some money. Would you like to share a Cashtag, Venmo, or other way to send you something for your work?

ptsochantaris commented 1 year ago

Wow, 4 is extremely low - I mean don't get me wrong, I'm super glad you're unblocked but someone with 5 repos should not even come close to causing an API timeout on GH, even if paging was close to 100. It must be some strange corner case which I'll try to keep in mind when going through the code. If you do think you're able to share any of the queries Trailer sends to my email (I totally appreciate this may not be possible BTW) then it may go a long way in helping me try to diagnose what kind of weird corner case is at work here :D

BTW statuses have been known to cause horrors in the past, so you may want to try disabling those to see if that somehow "unlocks" your page limit, although I realise Trailer is supposed to help with your work, not become your work :D So no pressure.

On the issue of money I've always maintained that I get as much from working on Trailer as I put into it and that's been good enough, but times are getting a bit tougher and at some point I will put up a sponsorship link to see if my hobbies can earn some pocket money, but that's a possibility in the future. I definitely thank you for your very kind offer though. If Trailer keeps making you happy in the future, feel free to pass by the repo and see if there's a monetisation link :D

amayers commented 1 year ago

@ptsochantaris I'm happy to share the request/response details via email. What's your email?

Normally I do have the Show PR CI / statuses enabled. But I disabled it while doing all this debugging, and so far haven't turned it back on. So that doesn't seem to be the limiting factor in this case.

ptsochantaris commented 1 year ago

Ah, that's good to know. Which makes things weirder and more interesting. My email is my GitHub handle (minus the leading "at" of course) at me.com - any and all info you are comfortable with sending over will be super helpful!

ptsochantaris commented 1 year ago

Here is an updated version which adds additional safeguards to batched GQL calls - it seems that even though Trailer respects GitHub's 500,000 node rule, anything above 40-50k nodes causes a timeout. This build enforces that limit which should mean there's no need to artificially limit item paging sizes or multithreaded queries.
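
To make the node arithmetic concrete, here is a rough sketch of estimating a batched query's worst-case node count and splitting repository IDs so each batch stays under a cap. This is an assumption-laden illustration, not Trailer's code. The formula is a guess that happens to reproduce the "Expected Count: 10255" logged earlier for what appears to be five repositories with `issues(first: 50)`, `assignees(first: 20)` and `labels(first: 20)`. The 40,000 cap is taken from the "40-50k" observation above, and all function names are hypothetical.

```python
from typing import List


def estimate_nodes(repo_count: int, items_per_page: int, child_page_sizes: List[int]) -> int:
    """Worst-case nodes for: repos -> items(first: N) -> each child connection(first: M)."""
    items = repo_count * items_per_page
    children = sum(items * size for size in child_page_sizes)
    return repo_count + items + children


def max_repos_per_batch(node_cap: int, items_per_page: int, child_page_sizes: List[int]) -> int:
    """Largest number of repositories whose estimated node count fits under node_cap."""
    per_repo = estimate_nodes(1, items_per_page, child_page_sizes)
    return max(1, node_cap // per_repo)


def batch_ids(ids: List[str], size: int) -> List[List[str]]:
    """Split repository IDs into consecutive batches of at most `size`."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


if __name__ == "__main__":
    print(estimate_nodes(5, 50, [20, 20]))  # 5 + 250 + 5000 + 5000
    size = max_repos_per_batch(40_000, 50, [20, 20])
    print([len(b) for b in batch_ids([f"repo{i}" for i in range(50)], size)])
```

Under these assumptions, a 50-item page with two 20-item child connections costs about 2,051 nodes per repository, so a 40,000-node cap allows 19 repositories per query.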

[edit: removed, updated version below]

ptsochantaris commented 1 year ago

Another iteration - also applies paging restrictions to things like review comments and reactions. Many thanks @amayers for helping with the testing!

[edit: removed, new build available below]

ptsochantaris commented 1 year ago

A specifically tweaked build that records response times in queries for testing - please note this build will not respect the v4 API paging settings in prefs.

[edit: removed, new build available below]

ptsochantaris commented 1 year ago

This build replaces the v4 sync settings with 3 presets: Safe / Default / High. Default is the same as the previous build, Safe is considerably lighter and worth trying if the v4 API times out, and High batches queries up to a large extent to reduce API usage cost.

[edit: removed, new build available below]

Jecoms commented 1 year ago

I tried that latest build yesterday and I was having success when using the Safe mode with all of my repos marked as participating (50+).

Away from my work computer currently, but I'll confirm next week that it's performing as expected for a number of syncs.

ptsochantaris commented 1 year ago

Thanks for testing @Jecoms - sounds like "Safe" may have to be the actual default on the next release. Let me know how it goes 👍

Jecoms commented 1 year ago

@ptsochantaris It's looking good to me. API Limit remains a green sliver while using Safe mode. Thanks for the iteration to a solution for this!

I'll try leaving it on Default for a bit as well. Default seems to be working okay as well. Maybe github made some improvements to their gql endpoint and we're not hitting the timeouts anymore (if default is querying the same as before).

Also now seems okay with High. I'll leave it here for the rest of the day and update if I hit any issues.

ptsochantaris commented 1 year ago

@Jecoms Thanks so much for testing, I really appreciate it! I suspect that if you selected "reload all data" from Preferences -> Misc while "High" is selected then it will fail, as that sync transfers a much larger amount of data, but incremental syncs after that should very likely be fine on "High". If things continue to look fine I'll put up an "official" update a little later today or tomorrow 👍

wassimk commented 1 year ago

:wave: I have this sync problem as well. I've been following along and trying every new build. This latest one in safe mode is now working for me!

One change that I believe helped was removing all custom-watched repositories and adding the minimum I needed to get by at work. I now have 20 repositories in total.

ptsochantaris commented 1 year ago

Thanks for the update @wassimk - it's very helpful; clearly GH's GraphQL is very sensitive when querying a large number of repos in one query - I'll be sure to keep the safe setting as a default then, and maybe even consider a lighter one too.

ptsochantaris commented 1 year ago

Here's another update - this one features an even lighter mode if needed, as well as improved handling of weird burps that GH has been having where it returns zero-byte responses.

[edit: removed, new build available below]

ptsochantaris commented 1 year ago

... and another one that introduces better logging in case of invalid JSON coming from GH

[edit: removed, new build available below]

ptsochantaris commented 1 year ago

... and a fix that tries to recover from occasional GH timeouts

[edit: removed, new build available now]

ptsochantaris commented 1 year ago

... and one that dumps the new network library used in 1.8.x, as I've seen logs that have truncated responses

[edit: removed, new build available now]

ptsochantaris commented 1 year ago

This build adds a simpler log viewer in Prefs -> Misc which displays the current activity in Trailer, which can be helpful when reporting syncing issues or even just browsing the API queries and responses of syncs.

Trailer-184-test10.zip

amayers commented 1 year ago

Ok, that last build seems to be working for me now. It looks like the automatic pause & retry of any 403 responses has caused it to eventually succeed. I see around 5 of the 403s back to back, before it succeeds. Thanks for all the work on this!

ptsochantaris commented 1 year ago

@amayers That's great news - I wish GitHub had a way to proactively tell clients if/when they are close to some rate limit instead of having to wait for a 403; it would really help tune all this without the guesswork. I really appreciate your help with this, and I'll get onto preparing an update with some cleanups so that other people with a similar issue can use the v4 API as well.

ptsochantaris commented 1 year ago

v1.8.4 is now up, and contains all the updates from this thread, as well as a stability fix. You can update from any build by selecting "check for updates" from the prefs window, or just wait for Trailer to ping you :) Thank you to everyone who tested or who gave feedback. I'm going to close this issue now, please feel free to open a new one if any issues persist.