cg-dot / oai-reverse-proxy

A fork of https://gitgud.io/khanon/oai-reverse-proxy
"No GCP keys available for model claude-3-opus@20240229" #1

Closed max14354 closed 3 months ago

max14354 commented 3 months ago

I receive this error when prompting the reverse proxy, even though all Claude models are enabled on Vertex AI and Opus works when prompted from their provided notebook. The only model that works is base Sonnet, so I know the key is valid, but it doesn't seem to have access to any model other than Sonnet 3.0.

This error appears when running npm run start: {"level":50,"time":1721645056545,"pid":41536,"module":"server","module":"key-checker","service":"gcp","key":"gcp-97d26037","error":"fetch failed","msg":"Network error while checking key; trying this key again in a minute."}

"gcp-claude": { "usage": "0 tokens", "activeKeys": 1, "revokedKeys": 0, "sonnetKeys": 1, "sonnet35Keys": 0, "haikuKeys": 0, "proomptersInQueue": 0, "estimatedQueueTime": "no wait" },

cg-dot commented 3 months ago

One possibility is that the region you've set doesn't support the model you're trying to use. When you request a model in an unsupported region, the Vertex AI API may return 500 Internal Server Error.

You can add LOG_LEVEL=debug to your environment variables to see the response from the Vertex AI API.

max14354 commented 3 months ago

I am using us-east5 in the credentials and have set it to that at every opportunity, unless there is somewhere else I should set it that I'm unaware of. Running again with the environment variable set to debug yields these logs:

{"level":50,"time":1721648796897,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-97d26037","error":"fetch failed","msg":"Network error while checking key; trying this key again in a minute."}
{"level":50,"time":1721648796898,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-0788445f","error":"fetch failed","msg":"Network error while checking key; trying this key again in a minute."}
{"level":30,"time":1721648796898,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"ovrr9m","timeoutId":113,"msg":"Batch complete."}
{"level":20,"time":1721648796898,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"xy91jy","timeoutId":128,"numEnabled":2,"numUnchecked":0,"msg":"Scheduling next check..."}
{"level":20,"time":1721648796898,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"xy91jy","timeoutId":128,"key":"gcp-97d26037","nextCheck":"2024-07-22T11:47:36.897Z","jitteredDelay":45661.5429078002,"msg":"Scheduled next recurring check."}

And then after the scheduled recurring check:

{"level":20,"time":1721648842565,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-97d26037","msg":"Checking key..."}
{"level":20,"time":1721648843993,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-97d26037","data":{"id":"msg_vrtx_01DBveB1t9vdagHgX9chX1Pu","type":"message","role":"assistant","model":"claude-3-sonnet-20240229","content":[],"stop_reason":"max_tokens","stop_sequence":null,"usage":{"input_tokens":11,"output_tokens":1}},"msg":"Response from GCP"}
{"level":20,"time":1721648843993,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-97d26037","msg":"GCP key check complete."}
{"level":30,"time":1721648843993,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-97d26037","families":["gcp-claude"],"msg":"Checked key."}
{"level":20,"time":1721648843994,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"hxf0ur","timeoutId":866,"numEnabled":2,"numUnchecked":0,"msg":"Scheduling next check..."}
{"level":20,"time":1721648843994,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"hxf0ur","timeoutId":866,"key":"gcp-0788445f","nextCheck":"2024-07-22T11:47:36.898Z","jitteredDelay":12004.058526194687,"msg":"Scheduled next recurring check."}
{"level":20,"time":1721648856011,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-0788445f","msg":"Checking key..."}
{"level":20,"time":1721648857368,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-0788445f","data":{"id":"msg_vrtx_01YNGue6QiDAyEw9krQjJhU6","type":"message","role":"assistant","model":"claude-3-sonnet-20240229","content":[],"stop_reason":"max_tokens","stop_sequence":null,"usage":{"input_tokens":11,"output_tokens":1}},"msg":"Response from GCP"}
{"level":20,"time":1721648857369,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-0788445f","msg":"GCP key check complete."}
{"level":30,"time":1721648857369,"pid":46552,"module":"server","module":"key-checker","service":"gcp","key":"gcp-0788445f","families":["gcp-claude"],"msg":"Checked key."}
{"level":20,"time":1721648857369,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"w9hycm","timeoutId":1724,"numEnabled":2,"numUnchecked":0,"msg":"Scheduling next check..."}
{"level":20,"time":1721648857369,"pid":46552,"module":"server","module":"key-checker","service":"gcp","callId":"w9hycm","timeoutId":1724,"key":"gcp-97d26037","nextCheck":"2024-07-22T13:17:23.993Z","jitteredDelay":6115401.529993923,"msg":"Scheduled next recurring check."}
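The entries above look like pino-style JSON logs, where level 20 is debug, 30 is info, and 50 is error under pino's default numbering. As a rough sketch for picking the error entries out of a long debug run, assuming the proxy's output has been saved with one JSON object per line (the helper names here are made up for illustration):

```python
import json
from datetime import datetime, timezone

# Default pino level numbers and their names.
PINO_LEVELS = {10: "trace", 20: "debug", 30: "info", 40: "warn", 50: "error", 60: "fatal"}


def parse_log_lines(text):
    """Parse newline-delimited JSON log entries, skipping anything that is not a JSON object."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. npm banner lines mixed into the output
        if isinstance(entry, dict):
            entries.append(entry)
    return entries


def errors_only(entries):
    """Keep entries logged at error level or above."""
    return [e for e in entries if e.get("level", 0) >= 50]


def summarize(entry):
    """Render one entry as 'UTC-time LEVEL key: message'."""
    when = datetime.fromtimestamp(entry["time"] / 1000, tz=timezone.utc)
    level = PINO_LEVELS.get(entry.get("level"), str(entry.get("level")))
    stamp = when.isoformat(timespec="seconds")
    return f"{stamp} {level.upper()} {entry.get('key', '-')}: {entry.get('msg', '')}"
```

Feeding the saved output through `parse_log_lines`, then `errors_only`, then `summarize` leaves only the "Network error while checking key" entries with readable timestamps, which makes it easier to see whether the failures happen only at startup.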

cg-dot commented 3 months ago

@max14354 Hmm, that is a bit strange. The "fetch failed" error in your logs points to a network issue while the server was starting. Can you find any more detailed logs near the "Now listening for connections." message?

max14354 commented 3 months ago

Sorry for the delay, I was double-checking that the service account does indeed have access to Opus. Here's what I've found: for some reason, all attempts to list the models available to the provided key return an empty response, which causes the reverse proxy to default to base Sonnet. That default is why base Sonnet works for me: the request goes through regardless of the check. For other models, the code checks whether the model is available, decides it isn't, and never passes the request along to GCP.

Using the same service account and json key on a local Python Notebook yields positive results:

from anthropic import AnthropicVertex
from google.oauth2 import service_account
import google.auth

LOCATION = "us-east5"

SCOPES = ['https://www.googleapis.com/auth/cloud-platform']  # Adjust as needed
# Raw string so the backslashes in the Windows path are not treated as escapes
credentials = service_account.Credentials.from_service_account_file(
    r'c:\Users\xxxxx\Downloads\key.json', scopes=SCOPES)
client = AnthropicVertex(region=LOCATION, project_id="sillytavern-xxxxx", credentials=credentials)

message = client.messages.create(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Send me a recipe for banana bread.",
        }
    ],
    model="claude-3-opus@20240229",
)
print(message)

This request goes through and returns:

Message(id='msg_vrtx_01Gh88jXEgCLo1nZD6vRWXap', content=[TextBlock(text="Here's a simple recipe for delicious banana bread:\n\nIngredients:\n- 2 cups all-purpose flour\n- 1 teaspoon baking soda\n- 1/4 teaspoon salt\n- 1/2 cup butter, softened\n- 3/4 cup brown sugar\n- 2 eggs\n- 2 1/3 cups mashed overripe bananas (about 4-5 bananas)\n- 1 teaspoon vanilla extract\n- 1/2 cup chopped walnuts (optional)\n\nInstructions:\n1. Preheat the oven to 350°F (175°C). Grease a 9x5-inch loaf pan.\n2. In a medium bowl, combine the flour, baking soda, and salt.\n3. In a large bowl, cream the butter and brown sugar until light and fluffy. Add the eggs one at a time, beating well after each addition.\n4. Mix in the mashed bananas and vanilla extract until combined.\n5. Gradually stir the dry ingredients into the wet mixture until just combined. Fold in the chopped walnuts, if using.\n6. Pour the batter into the prepared loaf pan and spread it evenly.\n7. Bake for 55-60 minutes, or until a toothpick inserted into the center comes out clean.\n8. Allow the bread to cool in the pan for 10 minutes before removing it and transferring it to a wire rack to cool completely.\n\nEnjoy your homemade banana bread!", type='text')], model='claude-3-opus-20240229', role='assistant', stop_reason='end_turn', stop_sequence=None, type='message', usage=Usage(input_tokens=15, output_tokens=358))

So everything works except the checks for access to specific models! From messages I've seen, I know I'm not the only one who has had this issue. Would it be possible to implement an optional workaround that assumes the key has access to the requested model?
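To see which models the service account can actually reach from outside the proxy, the notebook test above can be repeated per model with a one-token request, similar in spirit to the checker responses in the debug logs (which stop at `max_tokens` after one output token). This is only a sketch: `probe_models` and `build_client` are hypothetical helpers, the model list is an assumption about what was enabled on Vertex AI at the time, and the key path and project ID are placeholders.

```python
# Vertex AI model IDs assumed to be enabled for the project.
CANDIDATE_MODELS = [
    "claude-3-sonnet@20240229",
    "claude-3-opus@20240229",
    "claude-3-haiku@20240307",
    "claude-3-5-sonnet@20240620",
]


def probe_models(client, models=CANDIDATE_MODELS):
    """Send a one-token request per model and record whether it succeeded."""
    results = {}
    for model in models:
        try:
            client.messages.create(
                model=model,
                max_tokens=1,
                messages=[{"role": "user", "content": "Hi"}],
            )
            results[model] = "ok"
        except Exception as exc:  # record any failure and keep probing the rest
            results[model] = f"{type(exc).__name__}: {exc}"
    return results


def build_client(key_path, project_id, region="us-east5"):
    """Create an AnthropicVertex client from a service account JSON key."""
    # Imported lazily so probe_models can be exercised without these packages.
    from anthropic import AnthropicVertex
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    return AnthropicVertex(region=region, project_id=project_id, credentials=credentials)
```

Calling `probe_models(build_client("key.json", "your-project-id"))` and printing the result gives one status per model. If every model reports "ok" here while the proxy's checker still fails, the problem is more likely in how the proxy process reaches the network than in the key's permissions.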

cg-dot commented 3 months ago

@max14354 Could you update to the latest commit and try running it again? The latest commit includes more detailed error logging in the GCP checker.

max14354 commented 3 months ago

After updating:

{"level":30,"time":1721658801372,"pid":45624,"module":"server","build":"824adfb (main@cg-dot/oai-reverse-proxy)","nodeEnv":"production","diskSpace":{"diskPath":"C:","free":147834744832,"size":479324008448},"msg":"Startup complete."}
{"level":30,"time":1721658801372,"pid":45624,"module":"server","port":7860,"interface":"0.0.0.0","msg":"Now listening for connections."}
...
{"level":50,"time":1721658811480,"pid":45624,"module":"server","module":"key-checker","service":"gcp","key":"gcp-269b0288","cause":{"name":"ConnectTimeoutError","code":"UND_ERR_CONNECT_TIMEOUT","message":"Connect Timeout Error"},"error":"fetch failed","msg":"Network error while checking key; trying this key again in a minute."}
{"level":30,"time":1721658811480,"pid":45624,"module":"server","module":"key-checker","service":"gcp","callId":"kn9kib","timeoutId":113,"msg":"Batch complete."}
...
{"level":30,"time":1721658874042,"pid":45624,"module":"server","module":"key-checker","service":"gcp","key":"gcp-269b0288","families":["gcp-claude"],"msg":"Checked key."}

cg-dot commented 3 months ago

"cause":{"name":"ConnectTimeoutError","code":"UND_ERR_CONNECT_TIMEOUT","message":"Connect Timeout Error"},

This suggests there may be network issues on your end. Are you running the Jupyter notebook and oai-reverse-proxy on the same machine? Have you tried running the proxy on a different machine?

In theory, if the checker runs into network problems, requests through the proxy shouldn't work either, even if the check is skipped.
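In the log above, the ConnectTimeoutError is recorded roughly 10 seconds after "Now listening for connections." (1721658801372 to 1721658811480), which would fit a connect timeout of about that length. A quick way to check raw reachability from the same machine is a plain TCP connection to the regional Vertex AI host. The sketch below is a stand-alone diagnostic, not part of the proxy; `check_tcp` and `vertex_host` are hypothetical names, and the 10-second timeout is just an assumption chosen to match the gap seen in the log.

```python
import socket
import time


def vertex_host(region):
    """Regional Vertex AI API hostname, e.g. us-east5-aiplatform.googleapis.com."""
    return f"{region}-aiplatform.googleapis.com"


def check_tcp(host, port=443, timeout=10.0, connect=socket.create_connection):
    """Try to open a TCP connection; report success, error text and elapsed seconds.

    `connect` is injectable so the function can be tested without network access.
    """
    started = time.monotonic()
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError as exc:  # covers timeouts, refused connections and DNS failures
        elapsed = round(time.monotonic() - started, 2)
        return {"host": host, "ok": False, "error": f"{type(exc).__name__}: {exc}", "seconds": elapsed}
    conn.close()
    elapsed = round(time.monotonic() - started, 2)
    return {"host": host, "ok": True, "error": None, "seconds": elapsed}
```

Running `check_tcp(vertex_host("us-east5"))` a few times, ideally right when the proxy starts, shows whether the connection is merely slow at first or fails consistently. A success here only proves the TCP path works for Python; the Node process may still take a different route (for example a different address family or proxy setting).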

If you'd like to implement this change as a temporary workaround, you can modify src/shared/key-management/gcp/provider.ts starting from line 54. Change modelFamilies: ["gcp-claude"], to modelFamilies: ["gcp-claude", "gcp-claude-opus"], and set both haikuEnabled and sonnet35Enabled to true. This should enable all models by default now.
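To illustrate why this workaround lets Opus through, here is a small Python model of the gating behaviour described in this thread. It is not the proxy's TypeScript code: the class, the function names, and the model-to-family mapping are all assumptions made for illustration. Only the family names, the two flags, and the error message come from the thread.

```python
from dataclasses import dataclass, field


@dataclass
class GcpKey:
    # Hypothetical stand-in for the proxy's key record.
    name: str
    model_families: list = field(default_factory=lambda: ["gcp-claude"])
    haiku_enabled: bool = False
    sonnet35_enabled: bool = False


def family_for(model):
    """Assumed mapping: Opus has its own family, everything else shares one."""
    return "gcp-claude-opus" if "opus" in model else "gcp-claude"


def key_allows(key, model):
    """A key serves a model only if its family and per-model flag permit it."""
    if family_for(model) not in key.model_families:
        return False
    if "haiku" in model:
        return key.haiku_enabled
    if "3-5-sonnet" in model:
        return key.sonnet35_enabled
    return True


def pick_key(keys, model):
    """Return the first usable key, or fail the way the issue title shows."""
    for key in keys:
        if key_allows(key, model):
            return key
    raise LookupError(f"No GCP keys available for model {model}")
```

With the defaults, a key whose check failed still serves base Sonnet but is rejected for Opus before any request is sent. Setting `model_families=["gcp-claude", "gcp-claude-opus"]` and both flags to `True` makes `pick_key` accept every model, which matches what the suggested edit to provider.ts is meant to achieve.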

max14354 commented 3 months ago

Both the notebook and the proxy were run on the same machine, and I can't think of any differences between them that might interfere with the network. However, implementing the changes you suggested now has Opus working, even though the log still shows a connection error when checking the key.