benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

[SOLVED] 500 internal error ONLY when running gunicorn as a service #2842

Closed: Cory-Watson closed this issue 2 years ago

Cory-Watson commented 2 years ago

Possibly related to #1059 and #1908

I'm developing an API for an existing (CLI-based) Python app I've written, using fastapi + uvicorn + gunicorn + nginx to expose it.

When I run the gunicorn command manually everything works as expected, but when I run it as a systemd service some of the API calls return a 500 error. What really puzzles me is that only SOME of my API calls produce this behavior.

Simplified Example

Running gunicorn from the CLI on the server

[foo@bar rtbh-python]$ /opt/rtbh-python/env/bin/gunicorn -w 4 -k uvicorn.workers.UvicornWorker --bind 127.0.0.1:8000 --pid /run/bh-api/rtbh-api.pid --error-logfile log/gunicorn.log --capture-output --log-level debug rtbh_python.main:app

Hitting the API from the client side

❯ curl -sX 'GET' \
  'https://foo.bar:8443/api/v1/rtbh-status' \
  -H 'accept: application/json' |jq .
{
  "blackhole01": {
    # bunch of json here...
  },
  "blackhole02": {
    # bunch of json here...
  }
}

Now, when I run gunicorn as a systemd service, this same API call instantly returns a 500 error. In case it's relevant: when it works (i.e. when I run the gunicorn command manually) the API usually has to chew on the request for a little while, but under systemd it fails instantly with a 500.

systemd service file

# rtbh-api.service
[Unit]
Description=Gunicorn instance to serve RTBH API
After=network.target

[Service]
PIDFile=/run/rtbh-api/rtbh-api.pid
User=some-user
Group=some-group
WorkingDirectory=/opt/rtbh-python/
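# NOTE: Environment= below replaces the service's default PATH entirely (see the resolution at the end of the thread)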
Environment="PATH=/opt/rtbh-python/env/bin"
ExecStartPre=+/bin/mkdir -p /run/rtbh-api
ExecStartPre=+/bin/chown some-user:some-group /run/rtbh-api
ExecStart=/opt/rtbh-python/env/bin/gunicorn -w 4 -k uvicorn.workers.UvicornWorker --bind 127.0.0.1:8000 --pid /run/rtbh-api/rtbh-api.pid --error-logfile log/gunicorn.log --capture-output --log-level debug rtbh_python.main:app
ExecReload=+/bin/kill -s HUP $MAINPID
ExecStop=+/bin/kill -s TERM $MAINPID
ExecStopPost=+/bin/rm -rf /run/rtbh-api
PrivateTmp=true

[Install]
WantedBy=multi-user.target

NOTE: A more basic service file produces the same behavior, so I don't think the issue is here, but I'm always open to suggestions/improvements.

sudo systemctl daemon-reload
sudo systemctl start rtbh-api
sudo systemctl status rtbh-api
● rtbh-api.service - Gunicorn instance to serve RTBH API
   Loaded: loaded (/opt/rtbh-python/systemd/rtbh-api.service; enabled; vendor preset: disabled)
   Active: active (running) ...
...
...
❯ curl -X 'GET' \
  'https://foo.bar:8443/api/v1/rtbh-status' \
  -H 'accept: application/json'
Internal Server Error%

Same thing happens on the /docs page of the API.

(screenshot: the API's /docs page returning an Internal Server Error, taken 2022-08-05 6:27 PM)

Oddly enough, other API calls work just fine...

❯ curl -sX 'GET' \
  'https://foo.bar:8443/api/v1' \
  -H 'accept: application/json' |jq .
{
  "message": "Hello World",
  "app": "rtbh-python",
  "version": "1.4.0"
}

full_gunicorn_debug.log

NOTE: I think I captured a few failed attempts in the log file.

Looking at this snippet from the debug log, at face value it just looks like an error in my code or in uvicorn (which may be the case), but if I run gunicorn/uvicorn manually, or run the same code in the Python REPL locally, it works fine.

[2022-08-05 18:26:28 -0400] [835131] [ERROR] Exception in ASGI application
Traceback (most recent call last):
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/fastapi/applications.py", line 269, in __call__
    await super().__call__(scope, receive, send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/exceptions.py", line 93, in __call__
    raise exc
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/exceptions.py", line 82, in __call__
    await self.app(scope, receive, sender)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/routing.py", line 670, in __call__
    await route.handle(scope, receive, send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/routing.py", line 266, in handle
    await self.app(scope, receive, send)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/routing.py", line 65, in app
    response = await func(request)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/fastapi/routing.py", line 227, in app
    raw_response = await run_endpoint_function(
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/fastapi/routing.py", line 162, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/env/lib64/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/rtbh_python/main.py", line 130, in rtbh_status
    return rtbh_python.status.return_status_api(config=config)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/rtbh_python/status.py", line 171, in return_status_api
    uptime = utils.send_command_capture_stdout('uptime', rhost)
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/rtbh_python/tools/utils.py", line 94, in send_command_capture_stdout
    raise e
  File "/home/gitlab-runner/builds/XDs7E8__/0/cyber-sec/blocking/rtbh-python/rtbh_python/tools/utils.py", line 81, in send_command_capture_stdout
    p = subprocess.run('ssh {} {}'.format(rhost, command),
  File "/usr/lib64/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'ssh blackhole01 uptime' returned non-zero exit status 127.
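
For reference, exit status 127 from a shell means "command not found", so the failure points at the environment the service runs in rather than at gunicorn or uvicorn itself. A minimal sketch (hypothetical host name; assuming the subprocess.run call above uses shell=True, as the string-form command suggests) reproduces the same failure when PATH only contains the virtualenv's bin directory:

import subprocess

# Hypothetical reproduction: with PATH limited to the virtualenv's bin dir,
# the shell cannot find ssh, exits with 127, and check=True raises
# CalledProcessError, matching the traceback above.
try:
    subprocess.run(
        'ssh blackhole01 uptime',
        shell=True,
        check=True,
        capture_output=True,
        env={'PATH': '/opt/rtbh-python/env/bin'},
    )
except subprocess.CalledProcessError as e:
    print(e.returncode)  # -> 127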

Running the same thing locally in the REPL

>>> import rtbh_python as rtbh
>>> rtbh.main.rtbh_status()
{'blackhole01': {'status': 'up', 'uptime': '18:41:29 up 21 days,  3:20,  1 user,  load average: 0.00, 0.00, 0.00', 'bgp_info': {'ipv4Unicast': {'routerId': '192.168.12.2', 'as': 65001, 'vrfId': 0, 'vrfName': 'default', 'tableVersion': 3, 'ribCount': 5, 'ribMemory': 800, 'peerCount': 2, 'peerMemory': 42272, 'peers': {'192.168.12.1': {'hostname': 'border-router-1', 'remoteAs': 65002, 'version': 4, 'msgRcvd': 79, 'msgSent': 79, 'tableVersion': 0, 'outq': 0, 'inq': 0, 'peerUptime': '01:14:11', 'peerUptimeMsec': 4451000, 'peerUptimeEstablishedEpoch': 1659734839, 'prefixReceivedCount': 3, 'pfxRcd': 3, 'pfxSnt': 3, 'state': 'Established', 'idType': 'ipv4'}}, 'totalPeers': 1, 'dynamicPeers': 0, 'bestPath': {'multiPathRelax': 'false'}}}}, 'blackhole02': {'status': 'up', 'uptime': '18:41:28 up 21 days,  3:21,  0 users,  load average: 0.00, 0.00, 0.00', 'bgp_info': {'ipv4Unicast': {'routerId': '192.168.14.2', 'as': 65003, 'vrfId': 0, 'vrfName': 'default', 'tableVersion': 3, 'ribCount': 5, 'ribMemory': 800, 'peerCount': 2, 'peerMemory': 42272, 'peers': {'192.168.14.1': {'hostname': 'border-router-2', 'remoteAs': 65002, 'version': 4, 'msgRcvd': 79, 'msgSent': 79, 'tableVersion': 0, 'outq': 0, 'inq': 0, 'peerUptime': '01:14:10', 'peerUptimeMsec': 4450000, 'peerUptimeEstablishedEpoch': 1659734839, 'prefixReceivedCount': 3, 'pfxRcd': 3, 'pfxSnt': 3, 'state': 'Established', 'idType': 'ipv4'}}, 'totalPeers': 1, 'dynamicPeers': 0, 'bestPath': {'multiPathRelax': 'false'}}}}}

HELP??

Cory-Watson commented 2 years ago

Please @ me if you need any additional info to help reproduce. Thanks!

Cory-Watson commented 2 years ago

Turns out this wasn't a problem with gunicorn. Systemd wasn't sourcing the relevant ssh config.

Cory-Watson commented 2 years ago

Thought the ssh config was the problem, but turns out I was wrong... 🤷 Sorry for the premature close

Cory-Watson commented 2 years ago

This was not an issue with gunicorn/uvicorn.

Turns out I just needed to pass the full PATH into the systemd service file. It looks like I was overwriting the PATH environment variable with the following line from my service file:

Environment="PATH=/opt/rtbh-python/env/bin"

Adding the missing PATH components seems to have fixed the problem:

Environment="PATH=/opt/rtbh-python/env/bin:/bin:/sbin"