wandb / weave

Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
https://wandb.me/weave
Apache License 2.0

TransportServerError: 502 Server Error: Bad Gateway for url: https://api.wandb.ai/graphql, "Not a JSON answer" #2196

Open kerriewu-sunrun opened 3 weeks ago

kerriewu-sunrun commented 3 weeks ago

I am using weave on AWS Lambda and am running into an intermittent error when calling weave.init("project_name"). Roughly 99% of invocations complete without any issue, so I'm not sure how to go about debugging this. I'm looking for any pointers, or to hear whether this is a known issue. Thanks so much!

LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue in the function code. However, in rare cases, a Lambda runtime update can cause unexpected function behavior. For functions using managed runtimes, runtime updates can be triggered by a function change, or can be applied automatically. To determine if the runtime has been updated, check the runtime version in the INIT_START log entry. If this error correlates with a change in the runtime version, you may be able to mitigate this error by temporarily rolling back to the previous runtime version. For more information, see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html
[ERROR] TransportServerError: 502 Server Error: Bad Gateway for url: https://api.wandb.ai/graphql
Traceback (most recent call last):
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/logging/logger.py", line 447, in decorate
    return lambda_handler(event, context, *args, **kwargs)
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/metrics/provider/base.py", line 205, in decorate
    response = lambda_handler(event, context, *args, **kwargs)
  File "/var/task/app.py", line 174, in lambda_handler
    return app.resolve(event, context)
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 1918, in resolve
    response = self._resolve().build(self.current_event, self._cors)
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 2025, in _resolve
    return self._call_route(route, route_keys)  # pass fn args
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 2103, in _call_route
    route(router_middlewares=self._router_middlewares, app=self, route_arguments=route_arguments),
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 407, in __call__
    return self._middleware_stack(app)
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 1314, in __call__
    return self.current_middleware(app, self.next_middleware)
  File "/var/lang/lib/python3.11/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 1346, in _registered_api_adapter
    return app._to_response(next_middleware(**route_args))
  File "/var/task/app.py", line 67, in llm
    responses = get_completions(
  File "/var/task/app.py", line 92, in get_completions
    weave.init(project)
  File "/var/lang/lib/python3.11/site-packages/weave/trace_api.py", line 45, in init
    return weave_init.init_weave(project_name).client
  File "/var/lang/lib/python3.11/site-packages/weave/weave_init.py", line 121, in init_weave
    username = get_username()
  File "/var/lang/lib/python3.11/site-packages/weave/weave_init.py", line 25, in get_username
    return api.username()
  File "/var/lang/lib/python3.11/site-packages/weave/legacy/wandb_api.py", line 427, in username
    result = self.query(self.VIEWER_DEFAULT_ENTITY_QUERY)
  File "/var/lang/lib/python3.11/site-packages/weave/legacy/wandb_api.py", line 307, in query
    return session.execute(query, kwargs)
  File "/var/lang/lib/python3.11/site-packages/gql/client.py", line 1017, in execute
    result = self._execute(
  File "/var/lang/lib/python3.11/site-packages/gql/client.py", line 926, in _execute
    result = self.transport.execute(
  File "/var/lang/lib/python3.11/site-packages/gql/transport/requests.py", line 266, in execute
    raise_response_error(response, "Not a JSON answer")
  File "/var/lang/lib/python3.11/site-packages/gql/transport/requests.py", line 250, in raise_response_error
    raise TransportServerError(str(e), e.response.status_code) from e

andrewtruong commented 3 weeks ago

Hey @kerriewu-sunrun, sorry you ran into this.

This error is happening at init time because the username lookup (the GraphQL query to api.wandb.ai in your traceback) came back with a 502 instead of a result. We'll need a few fixes here:

  1. The issue seems intermittent, so we should add a retry
  2. We should give you a much more useful error message!

I'll take a look this week!
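
Until a retry exists inside weave itself, a caller-side workaround might look roughly like the sketch below. It is only an illustration of retrying with exponential backoff: `call_with_retry` and its parameters are hypothetical names, not part of weave, and the attempt count and delays are assumptions to tune against your Lambda timeout.

```python
import time


def call_with_retry(fn, *args, attempts=3, base_delay=0.5,
                    retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on the given exception types.

    Hypothetical helper, not a weave API. Waits base_delay, then
    2 * base_delay, and so on between attempts, and re-raises the
    last error once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn(*args, **kwargs)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))


# Possible usage in the handler from the traceback (assumes weave and gql
# are installed; narrowing retry_on to the error seen above avoids
# retrying unrelated failures such as a bad API key):
#
#   import weave
#   from gql.transport.exceptions import TransportServerError
#
#   call_with_retry(weave.init, project, retry_on=(TransportServerError,))
```

Retrying only on the transport error keeps genuine configuration problems failing fast, and the `sleep` parameter exists so the backoff can be stubbed out in tests.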