microbiomedata / nmdc-runtime

Runtime system for NMDC data management and orchestration
https://microbiomedata.github.io/nmdc-runtime/
Other
4 stars 3 forks source link

API returns HTTP 500 when request lacks `User-Agent` header (Analytics middleware raises `KeyError`) #515

Closed eecavanna closed 1 month ago

eecavanna commented 1 month ago

Offending line of code:

https://github.com/microbiomedata/nmdc-runtime/blob/fb9131649f0d6ab70e7b1f105b569837afa63d17/nmdc_runtime/api/analytics.py#L56

Example server log:

INFO:     10.42.5.67:35904 - "GET / HTTP/1.1" 500 Internal Server Error
2024-05-06T16:43:26.837967963Z ERROR:    Exception in ASGI application
  + Exception Group Traceback (most recent call last):
2024-05-06T16:43:26.837978723Z   |   File "/usr/local/lib/python3.10/site-packages/starlette/_utils.py", line 87, in collapse_excgroups
  |     yield
2024-05-06T16:43:26.837986353Z   |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py", line 190, in __call__
2024-05-06T16:43:26.837990423Z   |     async with anyio.create_task_group() as task_group:
  |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 678, in __aexit__
  |     raise BaseExceptionGroup(
2024-05-06T16:43:26.837999803Z   | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
2024-05-06T16:43:26.838002594Z   +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
2024-05-06T16:43:26.838011314Z     |     result = await app(  # type: ignore[func-returns-value]
    |   File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
2024-05-06T16:43:26.838016814Z     |     return await self.app(scope, receive, send)
    |   File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
2024-05-06T16:43:26.838022314Z     |     await super().__call__(scope, receive, send)
    |   File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
    |     await self.middleware_stack(scope, receive, send)
2024-05-06T16:43:26.838030444Z     |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    |     raise exc
2024-05-06T16:43:26.838035804Z     |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
2024-05-06T16:43:26.838038545Z     |     await self.app(scope, receive, _send)
    |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py", line 189, in __call__
2024-05-06T16:43:26.838043855Z     |     with collapse_excgroups():
    |   File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    |     self.gen.throw(typ, value, traceback)
2024-05-06T16:43:26.838065396Z     |   File "/usr/local/lib/python3.10/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
    |     raise exc
2024-05-06T16:43:26.838072176Z     |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py", line 191, in __call__
    |     response = await self.dispatch_func(request, call_next)
2024-05-06T16:43:26.838077136Z     |   File "/code/nmdc_runtime/api/analytics.py", line 56, in dispatch
    |     "user_agent": request.headers["user-agent"],
    |   File "/usr/local/lib/python3.10/site-packages/starlette/datastructures.py", line 565, in __getitem__
2024-05-06T16:43:26.838084426Z     |     raise KeyError(key)
2024-05-06T16:43:26.838086606Z     | KeyError: 'user-agent'
    +------------------------------------

Opinion:

I propose this be updated to tolerate a request whose user-agent is blank. I don't consider it the analytics middleware's responsibility to validate requests. If someone wants to block requests that lack a user-agent, I'd propose that be handled elsewhere (explicitly).

shreddd commented 1 month ago

Seems like a simple potential fix to use request.headers.get("user-agent", "") or request.headers.get("user-agent") depending on what you want your default to be.

ssarrafan commented 1 month ago

@eecavanna let me know if I can help nudge someone to review this so it can be closed by tomorrow.

eecavanna commented 1 month ago

Thanks! Hi @shreddd, I created a PR containing a change that, if merged, would fix this issue. The PR is linked above. I will message you on Slack about reviewing it/merging it in.