Closed github-actions[bot] closed 2 years ago
Is it unexpected that this didn't open a PR in every tooling repo?
There wasn't one created at least for exercism/nim-test-runner
: https://github.com/exercism/nim-test-runner/pulls?q=is%3Apr
exercism/nim-test-runner
doesn't appear in the output of the "Fetch list of target repos" step:
https://github.com/exercism/org-wide-files/runs/6402899031?check_suite_focus=true#step:2:40
Is it unexpected that this didn't open a PR in every tooling repo?
Yes.
I can't replicate the issue when sending a manual request to the API.
Getting the list of exercism repos with the exercism-tooling
topic:
gh search repos --owner=exercism --topic=exercism-tooling --limit 200 | cut -f1,3 | sort
The repos that do have the topic, but the workflow did not target:
exercism/delphi-test-runner
exercism/DEPRECATED.rikki-ruby-analyzer
exercism/nim-test-runner
exercism/stub-analyzer
Of those:
exercism/DEPRECATED.rikki-ruby-analyzer
and exercism/stub-analyzer
are archivedexercism/delphi-test-runner
is private.So that leaves us with:
exercism/nim-test-runner
The gh
command at the top of this post gives us the below table. For some reason, it was missing exactly:
exercism/plsql-test-runner
exercism/prolog-test-runner
exercism/rust-test-runner
at one point, even though those repos do have the topic...
Repo | Notes |
---|---|
abap-test-runner | public |
bash-analyzer | public |
bash-test-runner | public |
cfml-test-runner | public |
clojure-analyzer | public |
clojure-representer | public |
clojurescript-test-runner | public |
clojure-test-runner | public |
coffeescript-test-runner | public |
common-lisp-analyzer | public |
common-lisp-representer | public |
common-lisp-test-runner | public |
cpp-test-runner | public |
c-representer | public |
c-representer | public |
crystal-test-runner | public |
crystal-test-runner | public |
csharp-analyzer | public |
csharp-representer | public |
csharp-test-runner | public |
c-test-runner | public |
dart-test-runner | public |
delphi-test-runner | private |
DEPRECATED.rikki-ruby-analyzer | public, archived |
d-test-runner | public |
elixir-analyzer | public |
elixir-representer | public |
elixir-test-runner | public |
elm-analyzer | public |
elm-representer | public |
elm-test-runner | public |
emacs-lisp-test-runner | public |
erlang-analyzer | public |
erlang-representer | public |
erlang-test-runner | public |
fortran-test-runner | public |
fsharp-analyzer | public |
fsharp-representer | public |
fsharp-test-runner | public |
generic-analyzer | public |
generic-representer | public |
generic-test-runner | public |
go-analyzer | public |
go-representer | public |
go-test-runner | public |
groovy-test-runner | public |
haskell-test-runner | public |
haxe-analyzer | public |
haxe-representer | public |
haxe-test-runner | public |
j-analyzer | public |
java-analyzer | public |
java-representer | public |
javascript-analyzer | public |
javascript-representer | public |
javascript-test-runner | public |
java-test-runner | public |
j-representer | public |
j-test-runner | public |
julia-analyzer | public |
julia-representer | public |
julia-test-runner | public |
kotlin-analyzer | public |
kotlin-test-runner | public |
lfe-test-runner | public |
lua-test-runner | public |
mips-test-runner | public |
nim-representer | public |
nim-test-runner | public |
objective-c-test-runner | public |
ocaml-test-runner | public |
perl5-analyzer | public |
perl5-test-runner | public |
pharo-smalltalk-test-runner | public |
php-test-runner | public |
purescript-test-runner | public |
python-analyzer | public |
python-representer | public |
python-test-runner | public |
racket-test-runner | public |
raku-test-runner | public |
reasonml-test-runner | public |
red-analyzer | public |
red-representer | public |
red-test-runner | public |
r-test-runner | public |
ruby-analyzer | public |
ruby-representer | public |
ruby-test-runner | public |
rust-analyzer | public |
rust-representer | public |
rust-test-runner | public |
scala-analyzer | public |
scala-test-runner | public |
scheme-test-runner | public |
sml-test-runner | public |
stub-analyzer | public, archived |
swift-test-runner | public |
tcl-test-runner | public |
typescript-analyzer | public |
typescript-representer | public |
typescript-test-runner | public |
vbnet-test-runner | public |
vimscript-test-runner | public |
wasm-test-runner | public |
wren-representer | public |
wren-test-runner | public |
x86-64-assembly-test-runner | public |
zig-test-runner | public |
To others: Sascha manually triggered https://github.com/exercism/nim-test-runner/pull/138
That's really strange. Perhaps fetching the full repo object and comparing them between the listed and unlisted repos could show some differences that explain this?
This might be an issue with the GH search itself, too.
(Sorry for the less than helpful comments, can't look into it properly atm due to PC issues)
Maybe the problem is that when the tool does this it's paginated and only looks at the first page?
That was my initial thought too, but the call to github.paginate
should take care of it.
Sending the same query via curl shows that we should not be hitting the rate limits either:
$ curl \
> -H "Accept: application/vnd.github.v3+json" \
> https://api.github.com/search/repositories?q=org:exercism+topic:exercism-tooling+is:public+archived:false
{
"total_count": 106,
"incomplete_results": false,
"items": [
Could you retry and see if the rust-test-runner
works now pls?
Could you retry and see if the
rust-test-runner
works now pls?
The below command outputs 109 repo names, and includes rust-test-runner
.
I think I was seeing fewer at one point. But it might have been an intermittent rate-limiting or pagination thing. And --limit 200
might be optimistic for gh search repos
, but it isn't immediately documented that it won't work.
Did you change something?
$ gh search repos --owner=exercism --topic=exercism-tooling --limit 200 | cut -f1 | sort
exercism/abap-test-runner
exercism/bash-analyzer
exercism/bash-test-runner
exercism/cfml-test-runner
exercism/clojure-analyzer
exercism/clojure-representer
exercism/clojurescript-test-runner
exercism/clojure-test-runner
exercism/coffeescript-test-runner
exercism/common-lisp-analyzer
exercism/common-lisp-representer
exercism/common-lisp-test-runner
exercism/cpp-test-runner
exercism/c-representer
exercism/crystal-test-runner
exercism/csharp-analyzer
exercism/csharp-representer
exercism/csharp-test-runner
exercism/c-test-runner
exercism/dart-test-runner
exercism/delphi-test-runner
exercism/DEPRECATED.rikki-ruby-analyzer
exercism/d-test-runner
exercism/elixir-analyzer
exercism/elixir-representer
exercism/elixir-test-runner
exercism/elm-analyzer
exercism/elm-representer
exercism/elm-test-runner
exercism/emacs-lisp-test-runner
exercism/erlang-analyzer
exercism/erlang-representer
exercism/erlang-test-runner
exercism/fortran-test-runner
exercism/fsharp-analyzer
exercism/fsharp-representer
exercism/fsharp-test-runner
exercism/generic-analyzer
exercism/generic-representer
exercism/generic-test-runner
exercism/go-analyzer
exercism/go-representer
exercism/go-test-runner
exercism/groovy-test-runner
exercism/haskell-test-runner
exercism/haxe-analyzer
exercism/haxe-representer
exercism/haxe-test-runner
exercism/j-analyzer
exercism/java-analyzer
exercism/java-representer
exercism/javascript-analyzer
exercism/javascript-representer
exercism/javascript-test-runner
exercism/java-test-runner
exercism/j-representer
exercism/j-test-runner
exercism/julia-analyzer
exercism/julia-representer
exercism/julia-test-runner
exercism/kotlin-analyzer
exercism/kotlin-test-runner
exercism/lfe-test-runner
exercism/lua-test-runner
exercism/mips-test-runner
exercism/nim-representer
exercism/nim-test-runner
exercism/objective-c-test-runner
exercism/ocaml-test-runner
exercism/perl5-analyzer
exercism/perl5-test-runner
exercism/pharo-smalltalk-test-runner
exercism/php-test-runner
exercism/plsql-test-runner
exercism/prolog-test-runner
exercism/purescript-test-runner
exercism/python-analyzer
exercism/python-representer
exercism/python-test-runner
exercism/racket-test-runner
exercism/raku-test-runner
exercism/reasonml-test-runner
exercism/red-analyzer
exercism/red-representer
exercism/red-test-runner
exercism/r-test-runner
exercism/ruby-analyzer
exercism/ruby-representer
exercism/ruby-test-runner
exercism/rust-analyzer
exercism/rust-representer
exercism/rust-test-runner
exercism/scala-analyzer
exercism/scala-test-runner
exercism/scheme-test-runner
exercism/sml-test-runner
exercism/stub-analyzer
exercism/swift-test-runner
exercism/tcl-test-runner
exercism/typescript-analyzer
exercism/typescript-representer
exercism/typescript-test-runner
exercism/vbnet-test-runner
exercism/vimscript-test-runner
exercism/wasm-test-runner
exercism/wren-representer
exercism/wren-test-runner
exercism/x86-64-assembly-test-runner
exercism/zig-test-runner
I removed leading white space from the rust-test-runner
s topic. But I don't know if I actually did, because in most of the UI that doesn't show, but on one specific bit it did. So I'm wondering if it's strip
'd in most places, but not in some.
@ee7 could you submit a PR that adds your whitespace checks to GitHub please? :D
I removed leading white space from the
rust-test-runner
s topic.
I see.
By the way, is it intended that https://github.com/exercism/delphi-test-runner is private? As far as I can see, it's the only private exercism repo:
$ gh repo list exercism --visibility private
Showing 1 of 1 repositories in @exercism that match your search
exercism/delphi-test-runner private Oct 5, 2021
(It's probably also better to use gh repo list exercism
than gh search repos --owner=exercism
).
By the way, is it intended that https://github.com/exercism/delphi-test-runner is private? As far as I can see, it's the only private exercism repo:
Yes. It contains delphi's source code, which is proprietory (delphi is our only non open-source language AFAICT)
Yes. It contains delphi's source code
Ah.
@ee7 could you submit a PR that adds your whitespace checks to GitHub please? :D
@SaschaMann You jest, but I've actually been considering a repo that runs periodic checks on Exercism repo metadata. Then we can get notifications for things like:
I suppose it could even be a periodic workflow in exercism/configlet
.
(cc @kytrinyx to this too)
If you're gonna implement that, please add "repo is empty" as that has caused us issues a few times already :)
please add "repo is empty"
You'd like a failing check when any Exercism repo has isEmpty
as true
in the below?
The 5 repos with the lowest disk usage currently:
gh repo list exercism \
--limit 500 --no-archived --visibility public --json diskUsage,isEmpty,name \
| jq 'sort_by(.diskUsage) | .[0:5]'
[
{
"diskUsage": 6,
"isEmpty": false,
"name": "babel-preset-javascript"
},
{
"diskUsage": 11,
"isEmpty": false,
"name": "guys-slack-bot"
},
{
"diskUsage": 12,
"isEmpty": false,
"name": "open-source-stats"
},
{
"diskUsage": 12,
"isEmpty": false,
"name": "backlog"
},
{
"diskUsage": 13,
"isEmpty": false,
"name": "wip"
}
]
(https://github.com/exercism/babel-preset-javascript is quite small: CODE_OF_CONDUCT.md
and labels).
that has caused us issues a few times already :)
https://github.com/exercism/org-wide-files/issues/110? Can we filter out empty repos before we fork?
https://github.com/exercism/org-wide-files/issues/110? Can we filter out empty repos before we fork?
Yes but I'd argue that the org-wide-files should be synced to those repos as well and that a repo should generally not be empty, even if it only has a dummy commit. It seems like an easier fix to not have empty repos than to add logic to this and the webhooks to handle those situations.
Yes but I'd argue that the org-wide-files should be synced to those repos as well and that a repo should generally not be empty, even if it only has a dummy commit. It seems like an easier fix to not have empty repos than to add logic to this and the webhooks to handle those situations.
Agreed. At the least, a repo should have a README.md file
The below command outputs 109 repo names, and includes rust-test-runner.
Do we have any open issues remaining regarding not all repos being targeted?
Yes but I'd argue that the org-wide-files should be synced to those repos as well and that a repo should generally not be empty, even if it only has a dummy commit. It seems like an easier fix to not have empty repos than to add logic to this and the webhooks to handle those situations.
Agreed. At the least, a repo should have a README.md file
I'm more persuaded by the counter-argument: it seems easier to be robust to the existence of an empty repo, rather than being blocked by one and needing manual intervention (so org-wide-files
jobs fail until we remove, or add content to, an empty repo that is encountered). A new repo might be empty for a short period while it's being set up, and it'd be easy to forget that it would break org-wide-files
jobs. Creating an empty repo might be convenient for someone while setting something up, and the fact that it has happened before means that it can happen in the future.
I could also claim that an empty repo does not do anything yet, and so does not urgently need to be targeted by the next org-wide-files
PR. Furthermore, an empty repo might not yet have the correct topics set, so org-wide-files
cannot sync the correct files anyway.
We could have a periodic job that runs e.g. daily, and does things like producing an error if there is an empty repo. But that doesn't help if an empty repo is created shortly before an org-wide-files
job runs, and making the "exercism repo linting" job prevent an org-wide-files
job is even harder than filtering empty repos, and produces the same outcome - org-wide-files
being blocked.
But I probably don't have the full picture, and I'm not the one who has to deal with it. Do whatever's easiest, of course.
The below command outputs 109 repo names, and includes rust-test-runner.
Do we have any open issues remaining regarding not all repos being targeted?
I think only this issue. I believe org-wide-files
has now targeted every repo we wanted - it was only nim-test-runner
that was not targeted in this case.
It's strange that nim-test-runner
was missing, though.
It does have the topic:
gh repo view exercism/nim-test-runner --json repositoryTopics
{
"repositoryTopics": [
{
"name": "exercism-tooling"
},
{
"name": "exercism-test-runner"
}
]
}
In case it's a weird rate-limiting problem, maybe we could try using GITHUB_TOKEN
to make an authenticated request?
And/or switch to gh
(or curl
), if we're suspicious that it might perform better than actions/github-script
?
Getting all the Exercism-owned, public, non-archived, repos with the exercism-tooling
topic with gh
:
gh repo list exercism \
--visibility public \
--no-archived \
--limit 500 \
--topic exercism-tooling \
--json name \
--jq 'sort_by(.name)' \
| jq '.[:3]'
(The final pipe to jq
is just to slice to keep the output short below).
[
{
"name": "abap-test-runner"
},
{
"name": "bash-analyzer"
},
{
"name": "bash-test-runner"
}
]
I think this avoids the search API, because if I write --limit 1234
I don't see this message: https://github.com/cli/cli/blob/9454ad376953/pkg/cmd/repo/list/list.go#L181-L182
In case it's a weird rate-limiting problem, maybe we could try using GITHUB_TOKEN to make an authenticated request?
I think it already does that: https://github.com/actions/github-script#actionsgithub-script
A pre-authenticated octokit/rest.js client with pagination plugins
I don't have a particular preference for github-script vs gh for this particular use case. Though I'm not sure if this would be as readable in Bash. These lines would need to be rewritten: https://github.com/exercism/org-wide-files/blob/cd205480e83eb23769696e0ad553ad53d5b94558/.github/workflows/sync-rest.yml#L78-L97
Not relying on the search API seems like a good idea for resilience.
But then what does it use? The graphql API? Something special?
I didn't look up what the gh command does but there's an endpoint that returns all repositories in an organisation. You could then filter these based on topics afterwards.
I think that is what they do: https://github.com/cli/cli/blob/00e97121ec8cf4de86bdda704fb711302a8dfc96/pkg/cmd/repo/list/http.go#L80
I think the source of truth is https://github.com/cli/cli/blob/00e97121ec8c/pkg/cmd/repo/list/http.go#L32, but I don't understand exactly what I'm seeing. It looks like both listRepos
and searchRepos
use the GraphQL API v4, and FromSearch
is set to true
if e.g. we filter non-archived or by topic.
(Edit: Erik beat me by 7 seconds).
If the graphql search API is used, I'd assume that also has rate limits, so are we gaining anything by not using github script?
Aha, I've worked it out.
If I write
gh repo list exercism --no-archived --topic exercism-tooling --limit 1234
Then the output does start with
warning: this query uses the Search API which is capped at 1000 results maximum
But the warning does not appear if I add some --json
option:
gh repo list exercism --no-archived --topic exercism-tooling --limit 1234 --json name
I think they could still print that warning on stderr (which is where it appears when --json
is not used).
If the graphql search API is used, I'd assume that also has rate limits, so are we gaining anything by not using github script?
I don't know. But the GraphQL API and the REST API do have different limits: https://docs.github.com/en/graphql/overview/resource-limitations#rate-limit
Maybe let's do nothing for now, and keep an eye on the number of repos targeted by an org-wide-files
run? Or we could e.g. obtain the repo list by two independent methods, and require them to agree.
Workflow run: https://github.com/exercism/org-wide-files/actions/runs/2306860844