ossf / scorecard

OpenSSF Scorecard - Security health metrics for Open Source
https://scorecard.dev
Apache License 2.0
4.58k stars 496 forks source link

GitHub token switching is not working in Cron #280

Closed inferno-chromium closed 3 years ago

inferno-chromium commented 3 years ago

Describe the bug Frequent running on rate limit exceeded

Reproduction steps https://console.cloud.google.com/logs/viewer?src=ac&project=openssf&minLogLevel=0&expandAll=false&timestamp=2021-03-16T16:19:32.553000000Z&customFacets=&limitCustomFacetWidth=true&advancedFilter=resource.type%3D%22k8s_container%22%0Aresource.labels.project_id%3D%22openssf%22%0Aresource.labels.location%3D%22us-central1-c%22%0Aresource.labels.cluster_name%3D%22openssf%22%0Aresource.labels.namespace_name%3D%22default%22%0Alabels.k8s-pod%2Fjob-name:%22daily-score-%22%20severity%3E%3DDEFAULT&interval=PT1H&scrollTimestamp=2021-03-16T16:17:53.186854132Z&dateRangeUnbound=both&dateRangeStart=2021-03-16T15:19:32.552Z&dateRangeEnd=2021-03-16T16:19:32.552Z

E 2021-03-16T16:17:53.002024178Z {"msg":"Rate limit exceeded. Retrying...","ts":1615911473.0010984,"level":"warn","caller":"roundtripper/roundtripper.go:171"} E 2021-03-16T16:17:53.001958963Z {"level":"warn","msg":"Rate limit exceeded. Retrying...","caller":"roundtripper/roundtripper.go:171","ts":1615911473.0002775} E 2021-03-16T15:24:23.182137060Z {"caller":"roundtripper/roundtripper.go:167","level":"warn","ts":1615908263.1819801,"msg":"Rate limit exceeded. Waiting 53m29.818030646s to retry..."} E 2021-03-16T15:24:23.052225987Z {"msg":"Rate limit exceeded. Waiting 53m29.947978265s to retry...","caller":"roundtripper/roundtripper.go:167","ts":1615908263.052036,"level":"warn"} E 2021-03-16T15:24:23.044774464Z {"msg":"Rate limit exceeded. Waiting 53m29.955391961s to retry...","ts":1615908263.0446181,"caller":"roundtripper/roundtripper.go:167","level":"warn"} E 2021-03-16T15:24:23.009648111Z {"level":"warn","ts":1615908263.0094867,"caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m29.990522824s to retry..."} E 2021-03-16T15:24:22.989419887Z {"ts":1615908262.9892304,"caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m30.010780051s to retry...","level":"warn"} E 2021-03-16T15:24:22.981095760Z {"caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m30.019108613s to retry...","level":"warn","ts":1615908262.9809232} I 2021-03-16T15:24:19.564465972Z github.com/invoiceninja/dockerfiles

Expected behavior Are we switching between tokens ? Is it same across one scorecard call ?

inferno-chromium commented 3 years ago

@naveensrinivasan @dlorenc - PTAL, this is slowly down scorecards scaling, just saw this last night.

naveensrinivasan commented 3 years ago

Yes, we are. I have been looking at it also. I will take another look. We have 4 tokens now, we are trying to 2280 GitHub Repo's and with each repo, there is N number of calls.

inferno-chromium commented 3 years ago

Yes, we are. I have been looking at it also. I will take another look. We have 4 tokens now, we are trying to 2280 GitHub Repo's and with each repo, there is N number of calls.

I increased to 6 last night.

i feel we process very few repos before we re-hit the rate limit. e.g.

2021-03-16T16:17:53.002037571Z {"msg":"Rate limit exceeded. Retrying...","caller":"roundtripper/roundtripper.go:171","ts":1615911473.0013971,"level":"warn"} E 
2021-03-16T16:17:53.002031954Z {"caller":"roundtripper/roundtripper.go:171","level":"warn","msg":"Rate limit exceeded. Retrying...","ts":1615911473.0012403} E 
2021-03-16T16:17:53.002024178Z {"caller":"roundtripper/roundtripper.go:171","ts":1615911473.0010984,"level":"warn","msg":"Rate limit exceeded. Retrying..."} E 
2021-03-16T16:17:53.001958963Z {"ts":1615911473.0002775,"caller":"roundtripper/roundtripper.go:171","msg":"Rate limit exceeded. Retrying...","level":"warn"} E 
2021-03-16T15:24:23.182137060Z {"level":"warn","caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m29.818030646s to retry...","ts":1615908263.1819801} E 
2021-03-16T15:24:23.052225987Z {"msg":"Rate limit exceeded. Waiting 53m29.947978265s to retry...","ts":1615908263.052036,"level":"warn","caller":"roundtripper/roundtripper.go:167"} E 
2021-03-16T15:24:23.044774464Z {"msg":"Rate limit exceeded. Waiting 53m29.955391961s to retry...","caller":"roundtripper/roundtripper.go:167","ts":1615908263.0446181,"level":"warn"} E 
2021-03-16T15:24:23.009648111Z {"level":"warn","ts":1615908263.0094867,"msg":"Rate limit exceeded. Waiting 53m29.990522824s to retry...","caller":"roundtripper/roundtripper.go:167"} E 
2021-03-16T15:24:22.989419887Z {"caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m30.010780051s to retry...","level":"warn","ts":1615908262.9892304} E 
2021-03-16T15:24:22.981095760Z {"caller":"roundtripper/roundtripper.go:167","level":"warn","ts":1615908262.9809232,"msg":"Rate limit exceeded. Waiting 53m30.019108613s to retry..."} E 
2021-03-16T15:24:19.564465972Z github.com/invoiceninja/dockerfiles I 
2021-03-16T15:23:59.036887386Z github.com/internetarchive/openlibrary I 
2021-03-16T15:23:45.484818347Z github.com/intellij-rust/intellij-rust I 
2021-03-16T15:23:25.610494587Z github.com/intel/media-driver I 
2021-03-16T15:22:33.414466331Z github.com/intel/gvt-linux I 
2021-03-16T15:22:24.609202708Z github.com/intel/compute-runtime I 
2021-03-16T15:22:13.255145868Z github.com/instructure/canvas-lms I 
2021-03-16T15:22:01.466539047Z github.com/inspec/inspec I 
2021-03-16T15:21:52.499144242Z github.com/input-output-hk/jormungandr I 
2021-03-16T15:21:43.686118103Z github.com/ingydotnet/git-subrepo I 
2021-03-16T15:21:34.424730045Z github.com/influxdata/telegraf I 
2021-03-16T15:21:24.292290198Z github.com/influxdata/influxdb I 
2021-03-16T15:21:12.104477292Z github.com/infinispan/infinispan I 
2021-03-16T15:20:59.947029413Z github.com/infection/infection I 
2021-03-16T15:20:49.530508149Z github.com/impress-org/givewp I 
2021-03-16T15:20:42.717601284Z github.com/imazen/imageflow I 
2021-03-16T15:20:29.769346213Z github.com/image-rs/image I 
2021-03-16T15:19:49.502180641Z github.com/illumos/illumos-gate I 
2021-03-16T15:19:42.897541109Z github.com/illuminate/support I 
2021-03-16T15:19:32.835321552Z github.com/illuminate/database I 
2021-03-16T15:19:18.629120783Z github.com/igraph/igraph I 
2021-03-16T15:19:07.083963112Z github.com/ifmeorg/ifme I 
2021-03-16T15:18:34.163737021Z github.com/idaholab/moose I 
2021-03-16T15:18:21.860157901Z github.com/icsharpcode/SharpZipLib I 
2021-03-16T15:18:07.972600555Z github.com/icsharpcode/ILSpy I 
2021-03-16T15:17:53.017710087Z {"ts":1615907873.0175443,"msg":"Rate limit exceeded. Retrying...","caller":"roundtripper/roundtripper.go:171","level":"warn"} E 
naveensrinivasan commented 3 years ago

I will do a PR which will hash the token and dump it into the log we're using into the log statement when it hits the rate limit. This will help us identify the number of tokens from the log.

naveensrinivasan commented 3 years ago

Yes, we are. I have been looking at it also. I will take another look. We have 4 tokens now, we are trying to 2280 GitHub Repo's and with each repo, there is N number of calls.

I increased to 6 last night.

i feel we process very few repos before we re-hit the rate limit. e.g.

2021-03-16T16:17:53.002037571Z {"msg":"Rate limit exceeded. Retrying...","caller":"roundtripper/roundtripper.go:171","ts":1615911473.0013971,"level":"warn"} E 
2021-03-16T16:17:53.002031954Z {"caller":"roundtripper/roundtripper.go:171","level":"warn","msg":"Rate limit exceeded. Retrying...","ts":1615911473.0012403} E 
2021-03-16T16:17:53.002024178Z {"caller":"roundtripper/roundtripper.go:171","ts":1615911473.0010984,"level":"warn","msg":"Rate limit exceeded. Retrying..."} E 
2021-03-16T16:17:53.001958963Z {"ts":1615911473.0002775,"caller":"roundtripper/roundtripper.go:171","msg":"Rate limit exceeded. Retrying...","level":"warn"} E 
2021-03-16T15:24:23.182137060Z {"level":"warn","caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m29.818030646s to retry...","ts":1615908263.1819801} E 
2021-03-16T15:24:23.052225987Z {"msg":"Rate limit exceeded. Waiting 53m29.947978265s to retry...","ts":1615908263.052036,"level":"warn","caller":"roundtripper/roundtripper.go:167"} E 
2021-03-16T15:24:23.044774464Z {"msg":"Rate limit exceeded. Waiting 53m29.955391961s to retry...","caller":"roundtripper/roundtripper.go:167","ts":1615908263.0446181,"level":"warn"} E 
2021-03-16T15:24:23.009648111Z {"level":"warn","ts":1615908263.0094867,"msg":"Rate limit exceeded. Waiting 53m29.990522824s to retry...","caller":"roundtripper/roundtripper.go:167"} E 
2021-03-16T15:24:22.989419887Z {"caller":"roundtripper/roundtripper.go:167","msg":"Rate limit exceeded. Waiting 53m30.010780051s to retry...","level":"warn","ts":1615908262.9892304} E 
2021-03-16T15:24:22.981095760Z {"caller":"roundtripper/roundtripper.go:167","level":"warn","ts":1615908262.9809232,"msg":"Rate limit exceeded. Waiting 53m30.019108613s to retry..."} E 
2021-03-16T15:24:19.564465972Z github.com/invoiceninja/dockerfiles I 
2021-03-16T15:23:59.036887386Z github.com/internetarchive/openlibrary I 
2021-03-16T15:23:45.484818347Z github.com/intellij-rust/intellij-rust I 
2021-03-16T15:23:25.610494587Z github.com/intel/media-driver I 
2021-03-16T15:22:33.414466331Z github.com/intel/gvt-linux I 
2021-03-16T15:22:24.609202708Z github.com/intel/compute-runtime I 
2021-03-16T15:22:13.255145868Z github.com/instructure/canvas-lms I 
2021-03-16T15:22:01.466539047Z github.com/inspec/inspec I 
2021-03-16T15:21:52.499144242Z github.com/input-output-hk/jormungandr I 
2021-03-16T15:21:43.686118103Z github.com/ingydotnet/git-subrepo I 
2021-03-16T15:21:34.424730045Z github.com/influxdata/telegraf I 
2021-03-16T15:21:24.292290198Z github.com/influxdata/influxdb I 
2021-03-16T15:21:12.104477292Z github.com/infinispan/infinispan I 
2021-03-16T15:20:59.947029413Z github.com/infection/infection I 
2021-03-16T15:20:49.530508149Z github.com/impress-org/givewp I 
2021-03-16T15:20:42.717601284Z github.com/imazen/imageflow I 
2021-03-16T15:20:29.769346213Z github.com/image-rs/image I 
2021-03-16T15:19:49.502180641Z github.com/illumos/illumos-gate I 
2021-03-16T15:19:42.897541109Z github.com/illuminate/support I 
2021-03-16T15:19:32.835321552Z github.com/illuminate/database I 
2021-03-16T15:19:18.629120783Z github.com/igraph/igraph I 
2021-03-16T15:19:07.083963112Z github.com/ifmeorg/ifme I 
2021-03-16T15:18:34.163737021Z github.com/idaholab/moose I 
2021-03-16T15:18:21.860157901Z github.com/icsharpcode/SharpZipLib I 
2021-03-16T15:18:07.972600555Z github.com/icsharpcode/ILSpy I 
2021-03-16T15:17:53.017710087Z {"ts":1615907873.0175443,"msg":"Rate limit exceeded. Retrying...","caller":"roundtripper/roundtripper.go:171","level":"warn"} E 

I agree. We would have to look into this.

naveensrinivasan commented 3 years ago

I will investigate and post the updates.

inferno-chromium commented 3 years ago

I will do a PR which will hash the token and dump it into the log we're using into the log statement when it hits the rate limit. This will help us identify the number of tokens from the log.

or even dump the index of token in list :)

naveensrinivasan commented 3 years ago

There is a problem

➜  scorecard git:(feat/play-with-headers) ✗ go run .  --repo=github.com/ossf/scorecard
Starting [Active]
Starting [Branch-Protection]
Starting [CI-Tests]
Starting [CII-Best-Practices]
Starting [Code-Review]
Starting [Contributors]
Starting [Frozen-Deps]
Starting [Fuzzing]
Starting [Packaging]
Starting [Pull-Requests]
Starting [SAST]
Starting [Security-Policy]
Starting [Signed-Releases]
Starting [Signed-Tags]
{"level":"info","ts":1615918078.7519577,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
Finished [Fuzzing]
Finished [CII-Best-Practices]
Finished [SAST]
Finished [Signed-Releases]
Finished [Signed-Tags]
Finished [Branch-Protection]
Finished [Contributors]
Finished [Security-Policy]
Finished [Frozen-Deps]
Finished [Packaging]
Finished [Active]
Finished [Pull-Requests]
Finished [Code-Review]
Finished [CI-Tests]

RESULTS
-------
Active: Pass 10
Branch-Protection: Fail 0
CI-Tests: Pass 10
CII-Best-Practices: Fail 10
Code-Review: Pass 10
Contributors: Pass 10
Frozen-Deps: Pass 10
Fuzzing: Fail 10
Packaging: Pass 10
Pull-Requests: Pass 10
SAST: Pass 10
Security-Policy: Pass 10
Signed-Releases: Fail 5
Signed-Tags: Fail 7

It is always using only 1 token {"level":"info","ts":1615918078.7519577,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}

naveensrinivasan commented 3 years ago

Token roundrobin

The GitHub tokens are supposed to be roundrobinned across multiple tokens.

From the investigation, it doesn't look like it is doing that.

Here is the stack trace of token Token function

goroutine 20 [running]:
runtime/debug.Stack(0x0, 0x0, 0x0)
    /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/runtime/debug/stack.go:24 +0x9f
runtime/debug.PrintStack()
    /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/runtime/debug/stack.go:16 +0x25
github.com/ossf/scorecard/roundtripper.(*RoundRobinTokenSource).Token(0xc0001bbf80, 0x28, 0x1, 0xa6b50f)
    /home/sammy/go/src/github.com/ossf/scorecard/roundtripper/roundtripper.go:64 +0x11f
golang.org/x/oauth2.(*reuseTokenSource).Token(0xc000132c60, 0x0, 0x0, 0x0)
    /home/sammy/go/pkg/mod/golang.org/x/oauth2@v0.0.0-20201203001011-0b49973bad19/oauth2.go:304 +0xaf
golang.org/x/oauth2.(*Transport).RoundTrip(0xc000132c80, 0xc0001f4b00, 0x0, 0x0, 0x0)
    /home/sammy/go/pkg/mod/golang.org/x/oauth2@v0.0.0-20201203001011-0b49973bad19/transport.go:45 +0x9e
github.com/ossf/scorecard/roundtripper.(*RateLimitRoundTripper).RoundTrip(0xc000126be8, 0xc0001f4b00, 0xc6ad27, 0xe, 0x11cf1a0)
    /home/sammy/go/src/github.com/ossf/scorecard/roundtripper/roundtripper.go:155 +0xf0
github.com/naveensrinivasan/httpcache.(*Transport).RoundTrip(0xc00039e030, 0xc0001f4b00, 0xc00039e030, 0x0, 0x0)
    /home/sammy/go/pkg/mod/github.com/naveensrinivasan/httpcache@v1.2.2/httpcache.go:214 +0x798
net/http.send(0xc0001f4b00, 0xd44760, 0xc00039e030, 0x0, 0x0, 0x0, 0xc00012e4c8, 0xc69e4f, 0x1, 0x0)
    /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:251 +0x454
net/http.(*Client).send(0xc00039e060, 0xc0001f4b00, 0x0, 0x0, 0x0, 0xc00012e4c8, 0x0, 0x1, 0xb)
    /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:175 +0xff
net/http.(*Client).do(0xc00039e060, 0xc0001f4b00, 0x0, 0x0, 0x0)
    /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:717 +0x45f
net/http.(*Client).Do(...)
    /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:585
github.com/google/go-github/v32/github.(*Client).Do(0xc000164dc0, 0xd52a98, 0xc00011a000, 0xc0001f4a00, 0xaf50e0, 0xc000126c90, 0x0, 0x0, 0x0)
    /home/sammy/go/pkg/mod/github.com/google/go-github/v32@v32.1.0/github/github.go:537 +0x239
github.com/google/go-github/v32/github.(*RepositoriesService).ListCommits(0xc000164e48, 0xd52a98, 0xc00011a000, 0xc0001522e3, 0x4, 0xc0001522e8, 0x9, 0xc000325110, 0xc000100050, 0x110000c000080b88, ...)
    /home/sammy/go/pkg/mod/github.com/google/go-github/v32@v32.1.0/github/repos_commits.go:140 +0x1ee
github.com/ossf/scorecard/checks.IsActive(0xd52a98, 0xc00011a000, 0xc000164dc0, 0xc00012e4a8, 0xc00039e060, 0xc0001522e3, 0x4, 0xc0001522e8, 0x9, 0xc000255630, ...)
    /home/sammy/go/src/github.com/ossf/scorecard/checks/active.go:31 +0xd5
github.com/ossf/scorecard/checker.(*Runner).Run(0xc00013de90, 0xc9e448, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/sammy/go/src/github.com/ossf/scorecard/checker/checker.go:54 +0x1b5
github.com/ossf/scorecard/pkg.RunScorecards.func1(0xc00011b580, 0xd52a98, 0xc00011a000, 0xc000164dc0, 0xc00012e4a8, 0xc00039e060, 0xc0001522e3, 0x4, 0xc0001522e8, 0x9, ...)
    /home/sammy/go/src/github.com/ossf/scorecard/pkg/scorecard.go:110 +0xd4
created by github.com/ossf/scorecard/pkg.RunScorecards
    /home/sammy/go/src/github.com/ossf/scorecard/pkg/scorecard.go:107 +0x353

The Token func (which supplies the token) https://github.com/ossf/scorecard/blob/main/roundtripper/roundtripper.go#L58 is being called by

github.com/ossf/scorecard/roundtripper.(*RoundRobinTokenSource).Token(0xc0001bbf80, 0x28, 0x1, 0xa6b50f)
    /home/sammy/go/src/github.com/ossf/scorecard/roundtripper/roundtripper.go:64 +0x11f
golang.org/x/oauth2.(*reuseTokenSource).Token(0xc000132c60, 0x0, 0x0, 0x0)
    /home/sammy/go/pkg/mod/golang.org/x/oauth2@v0.0.0-20201203001011-0b49973bad19/oauth2.go:304 +0xaf

https://github.com/golang/oauth2/blob/cd4f82c27b84ec2516d3c38a5b5acaf8823e2702/oauth2.go#L295-310

From the :point_up: function it looks like the token is cached and never asked for subsequent calls.

So your assumption is right we are only using 1 token and that is the reason it is slow.

naveensrinivasan commented 3 years ago

We will have to come up with a strategy to use multiple tokens

inferno-chromium commented 3 years ago

Token roundrobin

The GitHub tokens are supposed to be roundrobinned across multiple tokens.

From the investigation, it doesn't look like it is doing that.

Here is the stack trace of token Token function

goroutine 20 [running]:
runtime/debug.Stack(0x0, 0x0, 0x0)
  /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/runtime/debug/stack.go:24 +0x9f
runtime/debug.PrintStack()
  /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/runtime/debug/stack.go:16 +0x25
github.com/ossf/scorecard/roundtripper.(*RoundRobinTokenSource).Token(0xc0001bbf80, 0x28, 0x1, 0xa6b50f)
  /home/sammy/go/src/github.com/ossf/scorecard/roundtripper/roundtripper.go:64 +0x11f
golang.org/x/oauth2.(*reuseTokenSource).Token(0xc000132c60, 0x0, 0x0, 0x0)
  /home/sammy/go/pkg/mod/golang.org/x/oauth2@v0.0.0-20201203001011-0b49973bad19/oauth2.go:304 +0xaf
golang.org/x/oauth2.(*Transport).RoundTrip(0xc000132c80, 0xc0001f4b00, 0x0, 0x0, 0x0)
  /home/sammy/go/pkg/mod/golang.org/x/oauth2@v0.0.0-20201203001011-0b49973bad19/transport.go:45 +0x9e
github.com/ossf/scorecard/roundtripper.(*RateLimitRoundTripper).RoundTrip(0xc000126be8, 0xc0001f4b00, 0xc6ad27, 0xe, 0x11cf1a0)
  /home/sammy/go/src/github.com/ossf/scorecard/roundtripper/roundtripper.go:155 +0xf0
github.com/naveensrinivasan/httpcache.(*Transport).RoundTrip(0xc00039e030, 0xc0001f4b00, 0xc00039e030, 0x0, 0x0)
  /home/sammy/go/pkg/mod/github.com/naveensrinivasan/httpcache@v1.2.2/httpcache.go:214 +0x798
net/http.send(0xc0001f4b00, 0xd44760, 0xc00039e030, 0x0, 0x0, 0x0, 0xc00012e4c8, 0xc69e4f, 0x1, 0x0)
  /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:251 +0x454
net/http.(*Client).send(0xc00039e060, 0xc0001f4b00, 0x0, 0x0, 0x0, 0xc00012e4c8, 0x0, 0x1, 0xb)
  /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:175 +0xff
net/http.(*Client).do(0xc00039e060, 0xc0001f4b00, 0x0, 0x0, 0x0)
  /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:717 +0x45f
net/http.(*Client).Do(...)
  /nix/store/97jqpixykvdf2awdfymn4xsci38v34yc-go-1.16/share/go/src/net/http/client.go:585
github.com/google/go-github/v32/github.(*Client).Do(0xc000164dc0, 0xd52a98, 0xc00011a000, 0xc0001f4a00, 0xaf50e0, 0xc000126c90, 0x0, 0x0, 0x0)
  /home/sammy/go/pkg/mod/github.com/google/go-github/v32@v32.1.0/github/github.go:537 +0x239
github.com/google/go-github/v32/github.(*RepositoriesService).ListCommits(0xc000164e48, 0xd52a98, 0xc00011a000, 0xc0001522e3, 0x4, 0xc0001522e8, 0x9, 0xc000325110, 0xc000100050, 0x110000c000080b88, ...)
  /home/sammy/go/pkg/mod/github.com/google/go-github/v32@v32.1.0/github/repos_commits.go:140 +0x1ee
github.com/ossf/scorecard/checks.IsActive(0xd52a98, 0xc00011a000, 0xc000164dc0, 0xc00012e4a8, 0xc00039e060, 0xc0001522e3, 0x4, 0xc0001522e8, 0x9, 0xc000255630, ...)
  /home/sammy/go/src/github.com/ossf/scorecard/checks/active.go:31 +0xd5
github.com/ossf/scorecard/checker.(*Runner).Run(0xc00013de90, 0xc9e448, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
  /home/sammy/go/src/github.com/ossf/scorecard/checker/checker.go:54 +0x1b5
github.com/ossf/scorecard/pkg.RunScorecards.func1(0xc00011b580, 0xd52a98, 0xc00011a000, 0xc000164dc0, 0xc00012e4a8, 0xc00039e060, 0xc0001522e3, 0x4, 0xc0001522e8, 0x9, ...)
  /home/sammy/go/src/github.com/ossf/scorecard/pkg/scorecard.go:110 +0xd4
created by github.com/ossf/scorecard/pkg.RunScorecards
  /home/sammy/go/src/github.com/ossf/scorecard/pkg/scorecard.go:107 +0x353

The Token func (which supplies the token) https://github.com/ossf/scorecard/blob/main/roundtripper/roundtripper.go#L58 is being called by

github.com/ossf/scorecard/roundtripper.(*RoundRobinTokenSource).Token(0xc0001bbf80, 0x28, 0x1, 0xa6b50f)
  /home/sammy/go/src/github.com/ossf/scorecard/roundtripper/roundtripper.go:64 +0x11f
golang.org/x/oauth2.(*reuseTokenSource).Token(0xc000132c60, 0x0, 0x0, 0x0)
  /home/sammy/go/pkg/mod/golang.org/x/oauth2@v0.0.0-20201203001011-0b49973bad19/oauth2.go:304 +0xaf

https://github.com/golang/oauth2/blob/cd4f82c27b84ec2516d3c38a5b5acaf8823e2702/oauth2.go#L295-310

From the ☝️ function it looks like the token is cached and never asked for subsequent calls.

So your assumption is right we are only using 1 token and that is the reason it is slow.

yes right, another token is not used, and also it sleeps for a long time after rate limit is reached with one token (rather than trying another token).

naveensrinivasan commented 3 years ago

The reason for another token is not being used because the token isn't expiring.

https://github.com/golang/oauth2/blob/cd4f82c27b84ec2516d3c38a5b5acaf8823e2702/token.go#L126

GitHub oauth2 tokens don't have time-based expiration. So that is the reason it doesn't refresh to new tokens.

AFAIK we would need multiple processes handling requests to solve this. I don't see any other way to solve this.

edit - Token never expires so the token source is never asked for a new token. https://github.com/golang/oauth2/blob/cd4f82c27b84ec2516d3c38a5b5acaf8823e2702/token.go#L45

naveensrinivasan commented 3 years ago

I was able to reproduce this easily.

➜  scorecard git:(feat/play-with-headers) ✗ export USE_BLOB_CACHE=1
➜  scorecard git:(feat/play-with-headers) ✗ ./cron/cron.sh
github.com/18F/identity-idp
{"level":"info","ts":1615923863.0318873,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/1N3/Sn1per
{"level":"info","ts":1615923879.7010963,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/24pullrequests/24pullrequests
{"level":"info","ts":1615923895.2188172,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AElfProject/AElf
{"level":"info","ts":1615923912.8261182,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/ARMmbed/mbed-os
{"level":"info","ts":1615923929.1262758,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/ARMmbed/mbedtls
{"level":"info","ts":1615923959.6104577,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/Activiti/Activiti
{"level":"info","ts":1615923976.672196,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AdguardTeam/AdguardFilters
{"level":"info","ts":1615923994.6448145,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AlchemyCMS/alchemy_cms
{"level":"info","ts":1615924007.311044,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AliasIO/wappalyzer
{"level":"info","ts":1615924024.244622,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/Amanieu/parking_lot
{"level":"info","ts":1615924043.902877,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AntennaPod/AntennaPod
{"level":"info","ts":1615924065.0364234,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/Anuken/Mindustry
{"level":"info","ts":1615924088.9289727,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/ApolloAuto/apollo
{"level":"info","ts":1615924105.2950606,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/Ardour/ardour
{"level":"info","ts":1615924313.61517,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/ArduPilot/ardupilot
{"level":"info","ts":1615924330.579817,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AssetSync/asset_sync
{"level":"info","ts":1615924348.134791,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AsyncHttpClient/async-http-client
{"level":"info","ts":1615924368.039443,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AudioKit/AudioKit
{"level":"info","ts":1615924386.4050689,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
github.com/AugurProject/augur
{"level":"info","ts":1615924406.5470684,"caller":"roundtripper/roundtripper.go:62","msg":"using token 1 and the len of the tokens are 2"}
{"level":"warn","ts":1615924418.6501276,"caller":"roundtripper/roundtripper.go:170","msg":"Rate limit exceeded. Waiting 49m17.349892294s to retry..."}
{"level":"warn","ts":1615924418.8244143,"caller":"roundtripper/roundtripper.go:170","msg":"Rate limit exceeded. Waiting 49m17.175597243s to retry..."}
{"level":"warn","ts":1615924418.9265325,"caller":"roundtripper/roundtripper.go:170","msg":"Rate limit exceeded. Waiting 49m17.073478613s to retry..."}
{"level":"warn","ts":1615924419.2328804,"caller":"roundtripper/roundtripper.go:170","msg":"Rate limit exceeded. Waiting 49m16.767132845s to retry..."}
inferno-chromium commented 3 years ago

We can cause the tokens to expire using

        return &oauth2.Token{
                AccessToken: r.AccessTokens[index],
+               Expiry: time.Now().Add(time.Second),  // Expire to iterate b/w tokens.
        }, nil

and remove the sleep when rate limit happens to let it iterate

        if remaining <= 0 {
-               reset, err := strconv.Atoi(resp.Header.Get("X-RateLimit-Reset"))
-               if err != nil {
-                       return resp, nil
-               }
-
-               duration := time.Until(time.Unix(int64(reset), 0))
-               gh.Logger.Warnf("Rate limit exceeded. Waiting %s to retry...", duration)
-
-               // Retry
-               time.Sleep(duration)
-               gh.Logger.Warnf("Rate limit exceeded. Retrying...")
+               gh.Logger.Warnf("Rate limit exceeded on token. Retrying.")
                return gh.RoundTrip(r)

This works. But then once all tokens expire, we would just be constantly DoSing the api (since no sleep). Any other ideas or thoughts?

naveensrinivasan commented 3 years ago

The goal is to come up with a Producer/Consumer pattern that will calculate the next best token to use and if all the tokens are out of their API limit and then block.

[]string{"token1","token2"} 

func givemeNextToken() string {
// iterate through all the tokens and return the best token based on the GitHub API check results
// if all of them don't have any more API requests and sleep for the duration and BLOCK
}

data := make(chan string)

// producer of the tokens
go func() {
    for {
        i = givemeNextToken(i)
        data <- i
    }
}()

 if remaining <= 0 { 
    // reset
   token := <- data
  // reset the gh.InnerTransport = NewTransport(context.TODO(), gh.Logger) with this new token
 }

Ideally, this should work. I would like to spend some time trying this. I will make a detailed write-up with some implementation.

inferno-chromium commented 3 years ago

The goal is to come up with a Producer/Consumer pattern that will calculate the next best token to use and if all the tokens are out of their API limit and then block.

[]string{"token1","token2"} 

func givemeNextToken() string {
// iterate through all the tokens and return the best token based on the GitHub API check results
// if all of them don't have any more API requests and sleep for the duration and BLOCK
}

data := make(chan string)

// producer of the tokens
go func() {
    for {
        i = givemeNextToken(i)
        data <- i
    }
}()

 if remaining <= 0 { 
    // reset
   token := <- data
  // reset the gh.InnerTransport = NewTransport(context.TODO(), gh.Logger) with this new token
 }

Ideally, this should work. I would like to spend some time trying this. I will make a detailed write-up with some implementation.

Thanks!

naveensrinivasan commented 3 years ago
➜  scorecard git:(feat/play-with-headers) ✗ ./scorecard  --repo=github.com/ossf/scorecard
Starting [Active]
Starting [Branch-Protection]
Starting [CI-Tests]
Starting [CII-Best-Practices]
Starting [Code-Review]
Starting [Contributors]
Starting [Frozen-Deps]
Starting [Fuzzing]
Starting [Packaging]
Starting [Pull-Requests]
Starting [SAST]
Starting [Security-Policy]
Starting [Signed-Releases]
Starting [Signed-Tags]
{"level":"info","ts":1616011689.6091163,"caller":"roundtripper/roundtripper.go:63","msg":"using token 1 and the len of the tokens are 2 and in the c is 1"}
{"level":"info","ts":1616011689.6918664,"caller":"roundtripper/roundtripper.go:63","msg":"using token 0 and the len of the tokens are 2 and in the c is 2"}
{"level":"info","ts":1616011689.7047877,"caller":"roundtripper/roundtripper.go:63","msg":"using token 1 and the len of the tokens are 2 and in the c is 3"}
{"level":"info","ts":1616011689.7564862,"caller":"roundtripper/roundtripper.go:63","msg":"using token 0 and the len of the tokens are 2 and in the c is 4"}
{"level":"info","ts":1616011689.7568052,"caller":"roundtripper/roundtripper.go:63","msg":"using token 1 and the len of the tokens are 2 and in the c is 5"}
{"level":"info","ts":1616011689.7582355,"caller":"roundtripper/roundtripper.go:63","msg":"using token 0 and the len of the tokens are 2 and in the c is 6"}
Finished [Security-Policy]
Finished [Fuzzing]
Finished [SAST]
Finished [Frozen-Deps]
Finished [Signed-Releases]
Finished [Packaging]
Finished [CII-Best-Practices]
Finished [Signed-Tags]
Finished [Branch-Protection]
Finished [Contributors]
Finished [Code-Review]
Finished [CI-Tests]
Finished [Pull-Requests]
Finished [Active]

RESULTS
-------
Active: Pass 10
Branch-Protection: Fail 0
CI-Tests: Pass 10
CII-Best-Practices: Fail 10
Code-Review: Pass 10
Contributors: Pass 10
Frozen-Deps: Pass 10
Fuzzing: Fail 10
Packaging: Pass 10
Pull-Requests: Pass 10
SAST: Pass 10
Security-Policy: Pass 10
Signed-Releases: Fail 5
Signed-Tags: Fail 7

I think we might have a solution.

Here is an example it is switching across 2 tokens.

{"level":"info","ts":1616011689.6091163,"caller":"roundtripper/roundtripper.go:63","msg":"using token 1 and the len of the tokens are 2 and in the c is 1"}
{"level":"info","ts":1616011689.6918664,"caller":"roundtripper/roundtripper.go:63","msg":"using token 0 and the len of the tokens are 2 and in the c is 2"}
{"level":"info","ts":1616011689.7047877,"caller":"roundtripper/roundtripper.go:63","msg":"using token 1 and the len of the tokens are 2 and in the c is 3"}
{"level":"info","ts":1616011689.7564862,"caller":"roundtripper/roundtripper.go:63","msg":"using token 0 and the len of the tokens are 2 and in the c is 4"}
{"level":"info","ts":1616011689.7568052,"caller":"roundtripper/roundtripper.go:63","msg":"using token 1 and the len of the tokens are 2 and in the c is 5"}
{"level":"info","ts":1616011689.7582355,"caller":"roundtripper/roundtripper.go:63","msg":"using token 0 and the len of the tokens are 2 and in the c is 6"}

@inferno-chromium FYI...