apache / incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
https://devlake.apache.org/
Apache License 2.0
2.52k stars 502 forks

[Bug][GitHub] Missing Pull Request data when changing the Time Range in Sync Policy #7704

Open d4x1 opened 2 weeks ago

d4x1 commented 2 weeks ago

What happened

Some users set up a new project with a GitHub connection through the onboarding guide. Because of https://github.com/apache/incubator-devlake/issues/7703, the GraphQL API is not enabled. This produces two records in the pull_requests table. When users then update the time range in the project's sync policy config and click COLLECT DATA, a new pipeline is triggered and finishes successfully. However, the data in the pull_requests table is incorrect: it still has only two records, even though the new time range definitely contains more pull requests.

What do you expect to happen

When the Time Range option in the Sync Policy Config changes, all pull requests should be fetched according to the new time range.

How to reproduce

See "What happened".

Anything else

No response

Version

2e56bdd

d4x1 commented 2 weeks ago

I think it's related to the GitHub PR collector. That said, because of the low rate limit of the GitHub REST API, we should use GraphQL as much as possible.