ethz-spylab / satml-llm-ctf

Code used to run the platform for the LLM CTF colocated with SaTML 2024
https://ctf.spylab.ai
MIT License

Getting 401 "division by zero" on /api/v1/defense/{id}/evaluate_utility #11

Closed KrystofM closed 9 months ago

KrystofM commented 9 months ago

Not sure what is happening, but I get a "division by zero" error, probably while some average is being calculated.

dpaleka commented 9 months ago

What's the ID of the defense for which this happens?

KrystofM commented 9 months ago

After some exploration, I found that the bug is triggered by leaving an "invalid" API key string in the request.

The following request gives the 401:

curl -X 'POST' \
  'https://ctf.spylab.ai/api/v1/defense/657a4c97e345b2db74f14358/evaluate-utility' \
  -H 'accept: application/json' \
  -H 'X-API-Key: XXX' \
  -H 'Content-Type: application/json' \
  -d '{
  "api_keys": {
     "openai":""
  },
  "model": "openai/gpt-3.5-turbo-1106",
  "small": true
}'

The same happens with any other invalid API key, such as "YOUR KEY" from the example. A more informative response than "division by zero" would probably be enough to resolve the issue.

dpaleka commented 9 months ago

Thanks for helping! It is indeed a bug that we do not return an informative error here; in fact, the same will probably happen if, e.g., the OpenAI account is blocked. I'll try to fix this soon.

dpaleka commented 9 months ago

Essentially we should just return avg_share_of_failed_queries=1 instead of an error, I guess
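
A minimal sketch of that idea, assuming the utility score is an average over the queries that succeeded. Apart from avg_share_of_failed_queries, every name here (UtilityResult, summarize, utility) is hypothetical and not taken from the platform code:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UtilityResult:
    # Hypothetical result shape; only avg_share_of_failed_queries
    # is a name that appears in this thread.
    utility: float
    avg_share_of_failed_queries: float


def summarize(scores: list[float], total_queries: int) -> UtilityResult:
    """Average the scores of successful queries without dividing by zero."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    failed = total_queries - len(scores)
    share_failed = failed / total_queries
    # When every query failed there is nothing to average, so report 0.0
    # instead of evaluating sum(scores) / 0.
    utility = sum(scores) / len(scores) if scores else 0.0
    return UtilityResult(utility=utility, avg_share_of_failed_queries=share_failed)
```

With an invalid key every query fails, so summarize([], 4) would report a failed share of 1.0 rather than raising.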

KrystofM commented 9 months ago

I would say returning an informative error with a 401 would be the sound course of action in this case. If calling OpenAI returns a 401, the message should be forwarded by the endpoint. More generally, if OpenAI returns a 4xx error on the first request (or on the first request that went through, in case of parallel load on the API), I would just throw an error and forward the given message. The proportion of failed requests in that case will always be 1.
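
Roughly what I have in mind, as a sequential sketch only. UpstreamClientError, run_queries, and the send callable are hypothetical names for illustration, not the platform's or OpenAI's actual API:

```python
class UpstreamClientError(Exception):
    """Hypothetical error carrying the provider's status code and message."""

    def __init__(self, status_code: int, message: str):
        super().__init__(message)
        self.status_code = status_code
        self.message = message


def run_queries(queries, send):
    """Run queries in order; abort if the first response is a 4xx error.

    `send` is an assumed callable that takes one query and returns
    a (status_code, body) tuple.
    """
    results = []
    for i, query in enumerate(queries):
        status, body = send(query)
        # A 4xx on the very first request (e.g. 401 for a bad key) means
        # every later request would fail the same way, so stop early and
        # let the endpoint forward the provider's status and message.
        if i == 0 and 400 <= status < 500:
            raise UpstreamClientError(status, body)
        results.append((status, body))
    return results
```

The endpoint handler could then catch UpstreamClientError and return its status_code and message to the caller instead of the "division by zero" text.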

dpaleka commented 9 months ago

This should be fixed now. Leaving open in case any new issues arise; otherwise will close in a few days.