firebase / firebase-admin-python

Firebase Admin Python SDK
https://firebase.google.com/docs/admin/setup
Apache License 2.0

Firebase Admin services, including `verify_id_token`, causing network hanging in production #711

Closed halesyy closed 12 months ago

halesyy commented 1 year ago

Hi all, really struggling with this one. I've had no issues with my Firebase for months, and even up to last night I had no issues at all - everything was working smoothly. But, all of a sudden, in the morning, one of our production servers (gunicorn + flask) stopped responding to requests, returning nothing.

I then went into debug mode, restarting nginx, etc., and putting prints all over the code. This showed me that the prints would go all the way up to the statement `decoded_token = auth.verify_id_token(id_token)`, which is a part of one of my authentication modules utilizing firebase-admin. I then checked this against my local machine (doing token issuing locally, then authenticating locally) and that worked fine.

When I use requests.get("https://accounts.google.com/o/oauth2/auth") on my production server, it takes around 60 seconds to respond. Once it responds, it is a success 200 with the auth page. The same occurs with https://oauth2.googleapis.com/token (longer, had to time it out when it reached 3 minutes), and I imagine all following Google APIs. If I request one of my own sites, it resolves quickly as expected. If I curl -I https://oauth2.googleapis.com/token, I swiftly get:

HTTP/2 404
content-type: text/html
date: Wed, 26 Jul 2023 01:52:59 GMT
server: scaffolding on HTTPServer2
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000

In fact, I've also just tested requests.get("https://www.google.com") and get the same hang, but curl "https://www.google.com" responds instantly. This does not occur for GitHub, my portfolio website, or Facebook, but it does occur for YouTube (a Google service), so I think this is the fundamental issue. Is this a Python issue?!
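One way to check whether a hang like this is tied to a particular address family is to time plain TCP connects over IPv4 and IPv6 separately. The following is a minimal sketch using only the standard library. `time_connect` and `report` are hypothetical helper names, and the 5-second timeout is an arbitrary choice.

```python
import socket
import time


def time_connect(host, port, family, timeout=5.0):
    """Try a TCP connect to every address of one family.

    Returns a list of (address, seconds, error) tuples, where error is
    None on success and a short message otherwise.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # No addresses of this family (or DNS itself failed).
        return [(None, 0.0, f"resolution failed: {exc}")]

    results = []
    for fam, socktype, proto, _canonname, sockaddr in infos:
        start = time.monotonic()
        error = None
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
        except OSError as exc:  # includes timeouts and unreachable networks
            error = str(exc) or type(exc).__name__
        results.append((sockaddr[0], time.monotonic() - start, error))
    return results


def report(host, port=443):
    """Print connect timings for IPv4 and IPv6 side by side."""
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        for address, seconds, error in time_connect(host, port, family):
            status = "ok" if error is None else f"failed ({error})"
            print(f"{label} {address}: {seconds:.2f}s {status}")
```

Calling `report("oauth2.googleapis.com")` on the affected server would show whether only the IPv6 addresses time out, which would point at the network path rather than at Python or firebase-admin.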

Further, if it was able to authenticate the token (rarely), the issue would move to the Firestore client, which would hang for 30, 60, or even 300 seconds and then error out with a timeout.

I rarely resort to filing bug reports, and I really appreciate any help with this. I feel as if there is missing logging, or some sort of issue occurring for my production server client which is not being logged. This hang is very odd, especially since it has not occurred in the last 3 months and started overnight. If I could get any further ideas on how to debug, that would be a great help.

Error from trying to verify the ID token

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/debian/module-autoreports/ar/databases/Firebase/FirebaseUtils.py", line 32, in token_to_uid
    decoded_token = auth.verify_id_token(id_token)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/firebase_admin/auth.py", line 220, in verify_id_token
    return client.verify_id_token(id_token, check_revoked=check_revoked)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/firebase_admin/_auth_client.py", line 127, in verify_id_token
    verified_claims = self._token_verifier.verify_id_token(id_token)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/firebase_admin/_token_gen.py", line 293, in verify_id_token
    return self.id_token_verifier.verify(id_token, self.request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/firebase_admin/_token_gen.py", line 392, in verify
    verified_claims = google.oauth2.id_token.verify_token(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/oauth2/id_token.py", line 133, in verify_token
    certs = _fetch_certs(request, certs_url)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/oauth2/id_token.py", line 99, in _fetch_certs
    response = request(certs_url, method="GET")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/firebase_admin/_token_gen.py", line 265, in __call__
    return self._delegate(
           ^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/auth/transport/requests.py", line 193, in __call__
    response = self.session.request(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/cachecontrol/adapter.py", line 57, in send
    resp = super(CacheControlAdapter, self).send(request, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/home/debian/.local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/home/debian/.local/lib/python3.11/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
KeyboardInterrupt

Error from trying to receive data from Firestore

  File "/home/debian/.local/lib/python3.11/site-packages/grpc/_channel.py", line 475, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/grpc/_channel.py", line 881, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.DEADLINE_EXCEEDED
        details = "Deadline Exceeded"
        debug_error_string = "UNKNOWN:Deadline Exceeded {grpc_status:4, created_time:"2023-07-26T10:20:19.590514685+10:00"}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/debian/.local/lib/python3.11/site-packages/google/api_core/retry.py", line 191, in retry_target
    return target()
           ^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 166, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/debian/.local/lib/python3.11/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
                                                ^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/pp2-backend/custom_routes/background/background.py", line 601, in remote_period
    response = safe_store_access(store_id)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/pp2-backend/custom_routes/background/background.py", line 219, in safe_store_access
    store = db.utils.get("entity_pharmacy", store_id)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/ar/databases/Firebase/FirebaseUtils.py", line 68, in get
    snapshot = collection_reference.document(document).get()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/cloud/firestore_v1/document.py", line 403, in get
    response_iter = self._client._firestore_api.batch_get_documents(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/cloud/firestore_v1/services/firestore/client.py", line 910, in batch_get_documents
    response = rpc(
               ^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 113, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/api_core/retry.py", line 349, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/home/debian/.local/lib/python3.11/site-packages/google/api_core/retry.py", line 207, in retry_target
    raise exceptions.RetryError(
google.api_core.exceptions.RetryError: Deadline of 300.0s exceeded while calling target function, last exception: 504 Deadline Exceeded

halesyy commented 1 year ago

Okay - update - I've been investigating this, and GPT4 gave me a few things to test. The first was to disable IPv6 on my server (after using `ip a` to confirm that my server does have IPv6) to see if that fixes it. For some reason, after adding the following config lines to /etc/sysctl.conf with sudo:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

and running `sudo sysctl -p`, the problem went away. I can now use requests.get against all Google services.
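If changing kernel settings is not an option, a comparable effect can be approximated inside the Python process by forcing name resolution to IPv4. This is a rough sketch, not something firebase-admin provides: `ipv4_only` is a hypothetical helper that temporarily patches `socket.getaddrinfo`. It should affect requests and urllib3, which resolve names through Python's socket module, but likely not the gRPC-based Firestore client, which performs its own resolution.

```python
import contextlib
import socket


@contextlib.contextmanager
def ipv4_only():
    """Temporarily make socket.getaddrinfo return only IPv4 results."""
    original = socket.getaddrinfo

    def getaddrinfo_ipv4(host, port, family=0, type=0, proto=0, flags=0):
        # Ignore the requested family and always ask for IPv4 addresses.
        return original(host, port, socket.AF_INET, type, proto, flags)

    socket.getaddrinfo = getaddrinfo_ipv4
    try:
        yield
    finally:
        # Always restore the real resolver, even if the body raised.
        socket.getaddrinfo = original
```

Wrapping a call such as `requests.get("https://www.google.com", timeout=10)` in `with ipv4_only():` is a quick way to confirm that IPv6 is the culprit before touching system configuration. The patch is process-wide while active, so it is better suited to debugging than to a multithreaded production server.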

I'm also going to try to set up Google DNS to hopefully prevent this in the future.

I feel as if the issue could have been with my server provider - maybe their IPv6 infra went down or something? Or Google changed some resolver in their Australian cloud services? The most annoying errors are the ones that occur outside of the codebase.

lahirumaramba commented 12 months ago

It could have been a network configuration issue in your environment. Glad you found a resolution. I will close this issue for now.