Replacing results in metriq throws HttpClientError

karlaspuldaro commented 1 year ago

Replacing results in metriq.info via metriq-client API results in Unauthorized error when making the request client.http.delete(f"/result/{result_id}/")

See log from automated failed job:

Deleting qiskit version 0.44.1 result from submission 595...
Traceback (most recent call last):
  File "/home/runner/work/submit-metriq/submit-metriq/src/run_experiment_steps.py", line 30, in <module>
    delete_submission_results(submission_id, r_version)
  File "/home/runner/work/submit-metriq/submit-metriq/src/preprocessing.py", line 69, in delete_submission_results
    client.http.delete(f"/result/{result_id}/")
  File "/home/runner/work/submit-metriq/submit-metriq/.tox/py38/lib/python3.8/site-packages/tea_client/http.py", line 306, in delete
    return self.request(
  File "/home/runner/work/submit-metriq/submit-metriq/.tox/py38/lib/python3.8/site-packages/tea_client/http.py", line 179, in request
    raise errors.HttpClientError(message, response=self.response)
tea_client.errors.HttpClientError: HttpClientError([40](https://github.com/karlaspuldaro/submit-metriq/actions/runs/6714275264/job/18247252560#step:5:41)1: Unauthorized)

karlaspuldaro commented 1 year ago

For some reason, sometimes we need to refresh the API token in between endpoint calls. In this case, I was able to recreate the Unauthorized HttpClientError when reaching the same line as shown in the log above. Running export METRIQ_TOKEN={token_value} before the execution fixed the issue upon delete. We must do something similar in CI.

FYI @vprusso @WrathfulSpatula is this expected/known issue?

WrathfulSpatula commented 1 year ago

This would be a bug, firstly, if this is specifically a token generated in the user profile settings. However, front end session tokens, like through username/password login, have a timed expiration.

It's possible that a timed expiration is accidentally being applied to tokens generated through the user profile settings. I'll be able to look into this tomorrow, (I promise, and I haven't forgotten about the box plot, with which I have procrastinated extremely, apologies. I prioritized other development work, but we haven't forgotten, and I'm sorry again!)

karlaspuldaro commented 1 year ago

@WrathfulSpatula thanks a lot! In this specific case our client instance makes a GET request for all results from a submission, then finds the one to be updated, then calls delete on a result item. That's when we get the unauthorized error (sometimes). Refreshing the token fixed the issue when it happened.

WrathfulSpatula commented 1 year ago

@karlaspuldaro Firstly, the gist is that I don't think the problem is a bug that sets an expiration time on the metriq-client token, if you generate a token in the user profile settings view of the front end. I'll try to decode an actual token, with debugger tools, to further confirm this. Please let me know, if you're getting a token from somewhere besides the user profile settings view in the front end. (It's theoretically possible that someone could do so, like by using a web front end token, which actually can be retrieved with a browser debugger or the wrong API route to our back end.)

The technical details might not be important for me to explain, to fix your issue, but, for sake of the issue ticket as a reference, I'll explain the code path. Below, I'll explain the logical flow, from the point of the method called on clicking the UI button in the user profile front end view, which makes a request to the back end.

handleGenerateOnClick() at line 41 of the Token.js view makes a request to the route /api/token, in the back end. (Pardon, GitHub just went partially down, as of a couple of minutes ago, and this is actually the last permalink I tried to add to this issue ticket comment, so I temporarily can't link it.)

That front end method, makes a public GET request for a metriq-client token mapped to a controller method, here: https://github.com/unitaryfund/metriq-api/blob/1514212e633698f07d67c58950bfec27af7eebfc/metriq-api/api-routes.js#L35

That's basically the entry point to our back end code. Because it's mapped to the controller method, the back end enters that method: https://github.com/unitaryfund/metriq-api/blob/1514212e633698f07d67c58950bfec27af7eebfc/metriq-api/controller/accountController.js#L100

As you can see in that controller method, we enter a service method, in userService.js: https://github.com/unitaryfund/metriq-api/blob/1514212e633698f07d67c58950bfec27af7eebfc/metriq-api/service/userService.js#L201

Notice (at least when GitHub is back up and fully functioning,) that generateWebJwt() has a true argument in third position, and generateClientJwt() has a false argument in third position. Immediately below, in that same service file, generateJwt() sets an expiration if the argument is true, and doesn't add an expiration to the token if false. The two arguments appear to be correct, for respective cases of metriq-client token vs. front end session token.

If I can decode token fields with debugging tools, we can basically further prove this is the correct stack trace, above. Again, please confirm that you're getting the token through the user profile view of the Metriq web front end.

WrathfulSpatula commented 1 year ago

Also, it's worth pointing out: if you hold an existing metriq-client token, and then you generate a new one in the user profile settings view of the front end, then all previously generated metriq-client tokens for your account are invalidated, as the text in the front end indicates.

So, if you have a token that's valid, which you're immediately using with client code itself, and then you generate any new metriq-client token in the user profile view, then you must replace all old textual code references to any old tokens, with the single latest token. (Sorry that this is jargony, but I can break it down more simply, if necessary.)

karlaspuldaro commented 1 year ago

@WrathfulSpatula thank you so much for the detailed explanation of the token flow, this is very helpful. The token we use locally and in CI is one created through the UI 5 months ago, so I can confirm the last scenario does not apply.

Another theory, and I think this is very likely, this is in an environmental variable issue in our CI, and how the tox environment we create accesses the token from the system. If the token variable is empty, the API still allows creating a client instance and, I assume, also allows making some public GET requests, eg. to submissions. Although on delete is when we get the 401 error, and I think this is because we have an empty token in the system environment variable.

WrathfulSpatula commented 1 year ago

I'm not personally familiar with tox, but that makes quite a bit of sense, as you explain your own "user code" base. I'm keeping an eye on the issue ticket, and I'll take another look at this on Monday, from the Metriq side, if not sooner. If you have updates on your end, I'm all ears.

karlaspuldaro commented 1 year ago

@WrathfulSpatula it might be hard to debug CI but I will investigate this on a draft PR soon

karlaspuldaro commented 11 months ago

@WrathfulSpatula it was a CI problem indeed, the API token was not being read at some point.

Btw, when you have a chance could you please review our submissions - 661 and 662 , they're been submitted a few months ago.

qiskit-community / qiskit-metriq

Replacing results in metriq throws HttpClientError #39