falling-fruit / falling-fruit-api

API (v0.3) for Falling Fruit
https://fallingfruit.org/api/0.3
GNU General Public License v3.0
5 stars 1 forks source link

Photo upload sometimes fails due to unknown error #26

Closed ezwelty closed 1 year ago

ezwelty commented 1 year ago

I am tearing my hair out trying to debug this problem. Photo upload (POST /api/0.3/photos) fails with HTTP 500 response in certain conditions:

I cannot find where the error originates. The request does not seem to reach the API, nor even the lighttpd web server (at least, I find nothing in either website access logs nor server error logs when request fails). Could it originate from Apache? Or even Cloudflare? Not the latter according to the response body (see below), which is devoid of cloudflare (see https://developers.cloudflare.com/support/troubleshooting/cloudflare-errors/troubleshooting-cloudflare-5xx-errors/#error-500-internal-server-error).

Could it be high traffic? Lighttpd error log has many such lines:

2023-05-21 21:02:24: (gw_backend.c.315) gw-server re-enabled: tcp:127.0.0.1:3301 127.0.0.1 3301  
2023-05-21 21:05:28: (gw_backend.c.236) establishing connection failed: Connection refused socket: tcp:127.0.0.1:3301 
2023-05-21 21:05:28: (gw_backend.c.939) all handlers for {request_url} on  are down.
2023-05-21 21:05:30: (gw_backend.c.315) gw-server re-enabled: tcp:127.0.0.1:3301 127.0.0.1 3301
...

Request

POST https://fallingfruit.org/api/0.3/photos?api_key={api_key} with form-data body with file: {photo_path}

:authority: beta.fallingfruit.org
:method: POST
:path: /api/0.3/photos?api_key={api_key}
:scheme: https
accept: application/json, text/plain, */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9,fr-FR;q=0.8,fr;q=0.7,de;q=0.6
content-length: 22032
content-type: multipart/form-data; boundary=----WebKitFormBoundarytgWDaB8BBekMo9JT
cookie: __utma=71168135.1257686696.1678958397.1678958397.1678958397.1; __utmc=71168135; __utmz=71168135.1678958397.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _session_id=218d88dd6060fd9a6dd45fe12607497e; welcome_modal_shown=yes
origin: https://beta.fallingfruit.org
referer: https://beta.fallingfruit.org/locations/3383/@40.04059999999998,-105.26319,16z
sec-ch-ua: "Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36

Response

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>500 Internal Server Error</title>
</head>
<body>
<h1>500 Internal Server Error</h1>
</body>
</html>
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400
cf-cache-status: DYNAMIC
cf-ray: 7cafc2862cf81d90-FRA
content-type: text/html
date: Sun, 21 May 2023 21:02:21 GMT
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=L1M%2BXhIy7II6I28gxj9XsCnn8xgjBzv9Guk0YAMmLWgd7PPrj6yWrQ23TrWid9dY479qEilr2dYV90yWhHaBfOdcZQjnd4R9r5zdyHZRXrM90M7P4Is5jBOPATXR8gqnpAVqOzQHBmA%3D"}],"group":"cf-nel","max_age":604800}
server: cloudflare
ezwelty commented 1 year ago

Problem was caused by pm2 restarting the process every time a photo was uploaded... See https://pm2.keymetrics.io/docs/usage/watch-and-restart/. Setting watch: false fixed the problem.