pingcap / ossinsight

Analysis, Comparison, Trends, Rankings of Open Source Software, you can also get insight from more than 7 billion with natural language (powered by OpenAI). Follow us on Twitter: https://twitter.com/ossinsight
https://ossinsight.io/
Apache License 2.0
1.77k stars 332 forks source link

[Explorer] Some problem for question c8cbf94d-4a9d-4fe8-82ce-e852084470ea: projects with the most contributors in the last year #1774

Open lietapa opened 1 month ago

lietapa commented 1 month ago

Hi, I have some problems with the question c8cbf94d-4a9d-4fe8-82ce-e852084470ea projects with the most contributors in the last year (errorType = none):

By just adding "20" in front of the above question, the generated SQL is quite different, and probably closer to the expected result than the above one: https://ossinsight.io/explore/?id=7c96df5f-7d3d-45d4-b802-89076f3fb109.

Context: I am trying to put some numbers on the contents of the last Octoverse and some assertions that I find there:

Generated SQL 1

SELECT
  ge.repo_name AS repo_name,
  COUNT(DISTINCT ge.actor_id) AS contributors
FROM
  github_events ge
WHERE
  ge.type = 'PullRequestEvent'
  AND ge.action = 'opened'
  AND YEAR(ge.created_at) = YEAR(CURRENT_DATE()) - 1
GROUP BY
  ge.repo_name
ORDER BY
  contributors DESC
LIMIT
  5

Chart:

{
  "chartName": "BarChart",
  "title": "Top 5 Projects with the Most Contributors in the Last Year",
  "x": "repo_name",
  "y": "contributors"
}

Result:

// Fields
  [
  {
    "columnType": 253,
    "name": "repo_name"
  },
  {
    "columnType": 8,
    "name": "contributors"
  }
]

// First result (Totally 5 rows)
  {
  "contributors": 13863,
  "repo_name": "firstcontributions/first-contributions"
}

Generated SQL 2

SELECT
  gr.repo_name,
  COUNT(DISTINCT ge.actor_login) AS contributors
FROM
  github_events ge
  INNER JOIN github_repos gr ON ge.repo_id = gr.repo_id
WHERE
  ge.type = 'PullRequestEvent'
  AND ge.action = 'closed'
  AND YEAR(ge.created_at) = YEAR(CURDATE()) - 1
GROUP BY
  gr.repo_name
ORDER BY
  contributors DESC
LIMIT
  20