OpenCHAMI / bss

MIT License

[BUG] BSS Logging is insufficient for troubleshooting #27

Open alexlovelltroy opened 3 months ago

alexlovelltroy commented 3 months ago

When debugging containers for the quickstart, I found log lines like those below:

2024/04/15 18:49:32 failed to initialize auth token: failed to parse JWK set: failed to unmarshal JWK set: error reading token: EOF
<--snip-->
2024/04/15 18:50:17 Failed to obtain client credentials and token: no access token found
2024/04/15 18:50:22 Attempting to obtain access token (attempt 10/10)
<--snip-->
2024/04/15 18:50:27 Access to SM service http://smd:27779 failed: Failed refreshing JWT: Failed to get access token: Exhausted 10 attempts at obtaining client credentials and token

Then bss died.

Clearly there's a problem with bss attempting to fetch a token for use in querying smd, but the logs don't give me enough information to start troubleshooting the issue.

What endpoint was used to obtain the EOF JWK?
Was the download successful, but the file was empty? Was there a problem with name resolution? Was the server on the other end slow? BSS has clearly exited because it believes that further attempts to start up will be unsuccessful. Why?

What about SMD? Did SMD reject any requests due to failed JWT checks? If so, which ones? Were there any messages associated with the rejections?
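To make the questions above concrete, here is a minimal Go sketch of a JWKS fetch whose errors name the endpoint, the HTTP status, the body size, and the elapsed time, and that distinguishes name resolution failures from timeouts. It is not the current BSS code: `fetchJWKS`, `checkJWKSResponse`, and `describeTransportError` are hypothetical names, and the local test server in `main` only simulates a server that answers 200 with an empty body.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// checkJWKSResponse (hypothetical helper) turns a completed HTTP exchange into
// an error that says which endpoint was used and why the payload is unusable.
func checkJWKSResponse(endpoint string, status int, body []byte) error {
	if status != http.StatusOK {
		return fmt.Errorf("JWKS endpoint %s returned HTTP %d", endpoint, status)
	}
	if len(body) == 0 {
		return fmt.Errorf("JWKS endpoint %s returned HTTP %d with an empty body", endpoint, status)
	}
	return nil
}

// describeTransportError (hypothetical helper) classifies a failed request so
// the log separates DNS problems from timeouts and other connection failures.
func describeTransportError(err error) string {
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		return "name resolution failed"
	}
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return "request timed out"
	}
	return "connection failed"
}

// fetchJWKS downloads the key set and logs enough context to troubleshoot.
func fetchJWKS(client *http.Client, endpoint string) ([]byte, error) {
	start := time.Now()
	resp, err := client.Get(endpoint)
	if err != nil {
		return nil, fmt.Errorf("fetching JWKS from %s after %s: %s: %w",
			endpoint, time.Since(start).Round(time.Millisecond), describeTransportError(err), err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("reading JWKS body from %s: %w", endpoint, err)
	}
	log.Printf("JWKS fetch: endpoint=%s status=%d bytes=%d elapsed=%s",
		endpoint, resp.StatusCode, len(body), time.Since(start).Round(time.Millisecond))

	if err := checkJWKSResponse(endpoint, resp.StatusCode, body); err != nil {
		return nil, err
	}
	return body, nil
}

func main() {
	// Stand-in server that replies 200 with no body, simulating the case
	// where the download "succeeds" but the key set is empty.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {}))
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	if _, err := fetchJWKS(client, srv.URL+"/keys"); err != nil {
		log.Printf("failed to initialize auth token: %v", err)
	}
}
```

With errors wrapped this way, the "error reading token: EOF" line would instead state the endpoint and that the body was empty, which answers the first two questions directly.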

davidallendj commented 3 months ago

This seems to be caused by opaal failing to fetch the JWKS from hydra: if hydra has not started by the time BSS makes a request to the /keys endpoint, opaal has no key set to serve. When this happens, it doesn't look like any of the code paths in opaal return an error or a redirect (which is not supposed to happen). So in addition to making BSS's logging more informative, we also need to improve the log messages in opaal.
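As a rough illustration of the behavior described above, the following Go sketch shows a /keys handler that answers with an explicit 502 and a log line when the upstream key set cannot be fetched, instead of silently returning an empty success. This is not opaal's actual code: `keysHandler`, `keysResponse`, `errUpstreamUnavailable`, and the injected `fetch` function are all hypothetical, and the choice of 502 is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// errUpstreamUnavailable is a hypothetical sentinel standing in for whatever
// error the identity provider fetch would return while it is still starting.
var errUpstreamUnavailable = errors.New("identity provider unavailable")

// keysResponse decides what the /keys handler should send: the key set on
// success, or an explicit gateway error that names the cause.
func keysResponse(jwks []byte, fetchErr error) (int, string) {
	switch {
	case fetchErr != nil:
		return http.StatusBadGateway,
			fmt.Sprintf("could not fetch JWKS from identity provider: %v", fetchErr)
	case len(jwks) == 0:
		return http.StatusBadGateway, "identity provider returned an empty JWKS"
	default:
		return http.StatusOK, string(jwks)
	}
}

// keysHandler wraps a fetch function (an assumption about how the upstream
// lookup is performed) and logs every failure before replying.
func keysHandler(fetch func() ([]byte, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		jwks, err := fetch()
		status, body := keysResponse(jwks, err)
		if status != http.StatusOK {
			log.Printf("GET %s from %s: responding %d: %s", r.URL.Path, r.RemoteAddr, status, body)
			http.Error(w, body, status)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, body)
	}
}

func main() {
	// Simulate the identity provider being down when a client asks for keys.
	h := keysHandler(func() ([]byte, error) { return nil, errUpstreamUnavailable })
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/keys", nil))
	fmt.Println(rec.Code) // prints 502
}
```

A client such as BSS would then see a non-200 status with a message, rather than an empty body that only surfaces later as an EOF while parsing.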