Closed MagicRB closed 4 months ago
The error we saw, seems to be coming from the token generation code, not from renewal being late.
Thanks for investigating.
@Mic92 was the error ephemeral? did you observe it within the same 10 minute time span again? without a buildbot restart?
I found https://buildbot.thalheim.io/#/builders/42/builds/360 also oh wait that's the same failure lol
I haven't and https://buildbot.thalheim.io/#/builders/42/builds/360 might be also just before I deployed the fixed version.
you mean #179 ?
Yes.
That's a separate fix though, I don't think it is related to this issue sadly
Alright. Let's take it easy and mark it as experimental until we figured out what is causing it: https://github.com/nix-community/buildbot-nix/pull/185
I'm staring at the code and I don't see it. I propose we ship a debug change, which dumps the expiration information into the logfile if an exception gets thrown from GitHub. Because I'm beginning to think GitHub is the one who's not cooperating. I really do not see how these 5 lines of code could be producing an incorrect exp
value
jwt_iat_drift: timedelta = timedelta(seconds=60)
now: datetime = datetime.now(tz=UTC)
iat: datetime = now - jwt_iat_drift
exp: datetime = iat + lifetime
jwt_payload = {
"iat": int(iat.timestamp()),
"exp": int(exp.timestamp()),
"iss": str(app_id),
}
return (jwt_payload, exp)
Do you need help reproducing this log, i.e. more logging?
Yeah, it would be nice to see what actually happens and why, I'm formulating a patch so that it's clearer what I'm looking for, give me a second
completely untested, but something along these lines, ill try to deploy it to my instance
diff --git a/buildbot_nix/github/installation_token.py b/buildbot_nix/github/installation_token.py
index da6f9f9..e05b6d6 100644
--- a/buildbot_nix/github/installation_token.py
+++ b/buildbot_nix/github/installation_token.py
@@ -3,6 +3,8 @@ from datetime import UTC, datetime, timedelta
from pathlib import Path
from typing import Any
+from twisted.logger import Logger
+
from buildbot_nix.common import (
HttpResponse,
atomic_write_file,
@@ -12,6 +14,8 @@ from buildbot_nix.common import (
from .jwt_token import JWTToken
from .repo_token import RepoToken
+tlog = Logger()
+
class InstallationToken(RepoToken):
GITHUB_TOKEN_LIFETIME: timedelta = timedelta(minutes=60)
@@ -27,12 +31,24 @@ class InstallationToken(RepoToken):
def _create_installation_access_token(
jwt_token: JWTToken, installation_id: int
) -> HttpResponse:
- return http_request(
- f"https://api.github.com/app/installations/{installation_id}/access_tokens",
- data={},
- headers={"Authorization": f"Bearer {jwt_token.get()}"},
- method="POST",
- )
+ last_exception: Exception
+
+ for retry in range(3):
+ try:
+ return http_request(
+ f"https://api.github.com/app/installations/{installation_id}/access_tokens",
+ data={},
+ headers={"Authorization": f"Bearer {jwt_token.get()}"},
+ method="POST",
+ )
+ break
+ except Exception as exception:
+ tlog.warn(
+ f"Getting an installation token from GitHub failed, exp: {jwt_token.exp}"
+ )
+ last_exception = exception
+
+ raise last_exception
@staticmethod
def _generate_token(
diff --git a/buildbot_nix/github/jwt_token.py b/buildbot_nix/github/jwt_token.py
index 79fe56b..658922b 100644
--- a/buildbot_nix/github/jwt_token.py
+++ b/buildbot_nix/github/jwt_token.py
@@ -26,14 +26,14 @@ class JWTToken(RepoToken):
self.app_private_key = app_private_key
self.lifetime = lifetime
- self.token, self.expiration = JWTToken.generate_token(
+ self.token, self.expiration, self.exp = JWTToken.generate_token(
self.app_id, self.app_private_key, lifetime
)
@staticmethod
def generate_token(
app_id: int, app_private_key: str, lifetime: timedelta
- ) -> tuple[str, datetime]:
+ ) -> tuple[str, datetime, int]:
def build_jwt_payload(
app_id: int, lifetime: timedelta
) -> tuple[dict[str, Any], datetime]:
@@ -66,7 +66,7 @@ class JWTToken(RepoToken):
json_headers = json.dumps({"alg": "RS256", "typ": "JWT"}).encode("utf-8")
encoded_jwt_parts = f"{base64url(json_headers)}.{base64url(jwt_payload)}"
encoded_mac = rs256_sign(encoded_jwt_parts, app_private_key)
- return (f"{encoded_jwt_parts}.{encoded_mac}", expiration)
+ return (f"{encoded_jwt_parts}.{encoded_mac}", expiration, jwt["exp"])
# installations = paginated_github_request("https://api.github.com/app/installations?per_page=100", generated_jwt)
@@ -74,7 +74,7 @@ class JWTToken(RepoToken):
def get(self) -> str:
if datetime.now(tz=UTC) - self.expiration > self.lifetime * 0.8:
- self.token, self.expiration = JWTToken.generate_token(
+ self.token, self.expiration, self.exp = JWTToken.generate_token(
self.app_id, self.app_private_key, self.lifetime
)
I copied it to a branch: https://github.com/nix-community/buildbot-nix/compare/debug-jwt?expand=1
I've observed random failures, where GitHub would deny the JWT or installation tokens, due to them being expired or being improperly generated. @Mic92 has ran into it too.
The code needs a checkup.