change-metrics / monocle

Monocle helps teams and individuals better organize daily duties and detect anomalies in the way changes are produced and reviewed.
https://changemetrics.io
GNU Affero General Public License v3.0

Error while running RequestRateLimit with Github provider #1124

Open hugoarroyo opened 1 month ago

hugoarroyo commented 1 month ago

I followed your k8s deployment guide to the letter: https://github.com/change-metrics/monocle/blob/master/k8s/README.md

I'm using this version: quay.io/change-metrics/monocle:1.11.1

I used the GitHub provider with this config:

kind: ConfigMap
metadata:
  name: monocle-config
  labels:
    app.kubernetes.io/part-of: monocle
data:
  config.yaml: |+
    workspaces:
      - name: marketing
        crawlers: 
          - name: marketing-crawler
            update_since: "2024-07-22"
            provider:
              github_organization : xxxxxxxx
              github_repositories:
                - xxxxxxxxxxxxx
              github_token: GITHUB_TOKEN 

I validated my GitHub token and added it to the secret: kubectl create secret generic monocle-secrets --from-file=.secrets -n monocle

But when the crawler starts, I always see this error in the log:

2024-07-24 01:20:24 WARNING Lentille.GraphQL:246: Could not fetch the current rate limit {"index":"cbi-marketing","crawler":"marketing-crawler","stream":"TaskDatas"}
2024-07-24 01:20:24 INFO    Macroscope.Worker:204: Posting documents {"index":"marketing","crawler":"marketing-crawler","stream":"TaskDatas","count":1}
2024-07-24 01:20:24 WARNING Macroscope.Worker:167: Stream produced a fatal error {"index":"cbi-marketing","crawler":"marketing-crawler","stream":"TaskDatas","err":["2024-07-24T01:20:24.050374029Z",{"contents":{"fetch_error":"parse failure: Error in $: Failed reading: not a valid json value at '<html><body><h1>400Badrequest<h1>'","request":{"body":"{\"operationName\":\"GetRateLimit\",\"query\":\"\\n    query GetRateLimit  {\\n      rateLimit {\\n        used\\n        remaining\\n        resetAt\\n      }\\n    }\\n  \",\"variables\":null}","resp":"<html><body><h1>400 Bad request</h1>\nYour browser sent an invalid request.\n</body></html>\n"}},"tag":"RateLimitInfoError"}]}
2024-07-24 01:20:24 INFO    Macroscope.Worker:183: Looking for oldest entity {"index":"cbi-marketing","crawler":"cbi-marketing-crawler","stream":"Changes","offset":0}
2024-07-24 01:20:24 INFO    Macroscope.Worker:199: Processing {"index":"marketing","crawler":"marketing-crawler","stream":"Changes","entity":{"contents":"xxxxxxxxx","tag":"Project"},"age":"2024-07-22T00:00:00Z"}

Thanks

TristanCacqueray commented 1 month ago

Perhaps the API returns 400 Bad request on auth error. Could you check the content of /etc/monocle-secrets/.secrets in the pod to verify the GITHUB_TOKEN is correct?
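For the check above, a minimal sketch of what to look for in the file content. It assumes `.secrets` is an env-style file with one `NAME=value` pair per line, which may not match your setup; `find_token_problems` is a hypothetical helper, not part of Monocle. Whether quotes or padding actually matter depends on how the file is loaded, so treat the findings as hints only.

```python
def find_token_problems(secrets_text: str, key: str = "GITHUB_TOKEN") -> list[str]:
    """Return human-readable problems with `key` in env-style text.

    Assumption: the file uses NAME=value lines, with '#' for comments.
    """
    for line in secrets_text.splitlines():
        # Skip comments and lines that are not assignments.
        if line.lstrip().startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() != key:
            continue
        problems = []
        if name != name.strip():
            problems.append("whitespace around the variable name")
        if value != value.strip():
            problems.append("leading or trailing whitespace in the value")
        stripped = value.strip()
        if not stripped:
            problems.append("value is empty")
        # A value wrapped in matching quotes may be passed through literally.
        if len(stripped) >= 2 and stripped[0] == stripped[-1] and stripped[0] in "'\"":
            problems.append("value is wrapped in quotes")
        return problems
    return [f"{key} is not defined"]
```

To use it, copy the content of /etc/monocle-secrets/.secrets out of the pod and pass it in as a string; an empty list means none of these particular issues were found, not that the token is accepted.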

hugoarroyo commented 1 month ago

I verified that the token is there, and it is valid: I can use it to connect to the GitHub GraphQL API with Thunder Client. But the crawler process still returns:

Stream produced a fatal error {"index":"cbi-marketing","crawler":"marketing-crawler","stream":"TaskDatas","err":["2024-07-24T01:20:24.050374029Z",{"contents":{"fetch_error":"parse failure: Error in $: Failed reading: not a valid json value at '<html><body><h1>400Badrequest<h1>'","request":{"body":"{\"operationName\":\"GetRateLimit\",\"query\":\"\\n query GetRateLimit {\\n rateLimit {\\n used\\n remaining\\n resetAt\\n }\\n }\\n \",\"variables\":null}","resp":"<html><body><h1>400 Bad request</h1>\nYour browser sent an invalid request.\n</body></html>\n"}},"tag":"RateLimitInfoError"}]}
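To compare against what the crawler sends, here is a rough sketch that rebuilds the same GetRateLimit query outside Monocle and classifies the response body. The helper names are hypothetical, the default URL is GitHub's public GraphQL endpoint (an assumption, since the crawler may be pointed elsewhere), and the `User-Agent` value is arbitrary. It only uses the Python standard library.

```python
import json
import urllib.request

# Assumption: the public GitHub GraphQL endpoint.
GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Same fields as the query shown in the log above.
RATE_LIMIT_QUERY = "query GetRateLimit { rateLimit { used remaining resetAt } }"


def build_rate_limit_request(token: str, url: str = GITHUB_GRAPHQL_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the GetRateLimit query."""
    payload = json.dumps(
        {"operationName": "GetRateLimit", "query": RATE_LIMIT_QUERY, "variables": None}
    ).encode("utf-8")
    headers = {
        "Authorization": f"bearer {token}",
        "Content-Type": "application/json",
        "User-Agent": "rate-limit-check",  # arbitrary label for this sketch
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def classify_body(body: str) -> str:
    """Tell a GraphQL rate limit answer apart from other JSON and from non-JSON."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return "not-json"  # e.g. an HTML error page, as in the log above
    if isinstance(doc, dict) and isinstance(doc.get("data"), dict) and "rateLimit" in doc["data"]:
        return "rate-limit"
    return "json-without-rate-limit"
```

Sending the request with `urllib.request.urlopen(build_rate_limit_request(token))` from inside the crawler pod and from a workstation, then passing each body to `classify_body`, would show whether the HTML 400 page appears only from within the cluster. Note that `urlopen` raises `urllib.error.HTTPError` on a 4xx status; the body is still readable from that exception with `.read()`.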