chiefwigms / picobrew_pico

MIT License

502 Bad Gateway / SERVER COMM ERROR CODE 7 #307

Open meirion opened 2 years ago

meirion commented 2 years ago

Hi, using a Pi ZeroW here. Have successfully done a couple of brews with the server in Jan and March this year. Got the Pi back out again today to do a deep clean on the machine and having trouble connecting.

Picobrew machine shows SERVER COMMUNICATION ERROR CODE 7 on the screen.

Connecting to the PICOBREW AP and accessing https://picobrew.com or 192.168.72.1, or connecting to my home wifi and accessing https://192.168.0.88/ (the Pi's IP), all give:

502 Bad Gateway
nginx/1.14.2

I can access the picobrew_server share via samba fine.

I'm not sure where to go from here; can someone advise, please? I'd love to get the machine brewing again!

thanks very much

tmack8001 commented 2 years ago

This, to me, signals that the python+flask process (started via systemctl start rc.local) isn't starting up.

Without logs from the device I can't be certain what is happening. After powering on, did you leave the device for 10-45 minutes before trying? If you are connected to the internet / home wifi, it can take a while to update if you truly haven't restarted in over 9 months, due to the number of changes that might be ready to pull to your device on startup.

If you still can't get it to function, back up the samba share (really only recipes, but you could also back up sessions) and start all over again, as the SD card may have been corrupted when powered off.
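
If you can reach the Pi over ssh, a couple of quick checks will show whether the server process actually came up (a minimal sketch assuming the stock image layout; the exact log filename depends on your nginx config and may differ on a custom setup):

sudo systemctl status rc.local            # did the startup script that launches the flask server run?
ps aux | grep -i server.py                # is the python server process actually alive?
sudo tail -n 50 /var/log/nginx/error.log  # recent nginx errors behind the 502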

BuckoWA commented 2 years ago

I too am getting this error. My setup uses a separate DD-WRT router for dnsmasq. The error shows up in picobrew.error.log.

I don't know if it's my setup with router or something in nginx configuration. Will keep you all posted as I investigate.

tmack8001 commented 2 years ago

Can either of you access the raspberrypi web interface normally from chrome/safari/edge/etc.? Also, what version are you running? How frequently does this error occur?

BuckoWA commented 2 years ago

I can normally access the Pi's web interface from multiple browsers. The server then crashes and I get the same 502 error as above upon refreshing the web page. If I restart the server and refresh the browser, all is good until it crashes again anywhere from 30 minutes to a couple of hours later. I always get the upstream prematurely closed connection error in picobrew.error.log. I'm testing something now which appears to work.

tmack8001 commented 2 years ago

When the server crashes, what is at the end of the logs when viewing them with systemctl status rc.local -n <num-of-lines>, where <num-of-lines> is large enough to reach a problematic line in the logs?

This is the same as, or similar to, what the UI would give you; however, if you need to restart, the log files (in memory) are wiped out and no longer available. Best to grab them from an ssh or terminal session if you can; as you are making custom changes, I'm sure you can 😉.
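
For example (unit name as on the stock image; adjust the line count to taste):

systemctl status rc.local -n 200        # show the last 200 journal lines for the service
journalctl -u rc-local.service -n 200   # equivalent view straight from the journal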

BuckoWA commented 2 years ago

Ok - I reverted the changes I was testing late last night and restarted. Got a crash some time overnight. Here are the contents of the rc.local status.

Loaded: loaded (/lib/systemd/system/rc-local.service; enabled-runtime; vendor preset: enabled)
Drop-In: /usr/lib/systemd/system/rc-local.service.d
         └─debian.conf
         /etc/systemd/system/rc-local.service.d
         └─ttyoutput.conf
Active: active (exited) since Fri 2021-12-31 14:00:47 PST; 1 weeks 1 days ago
Docs: man:systemd-rc-local-generator(8)
Tasks: 0 (limit: 2062)
CGroup: /system.slice/rc-local.service

Dec 31 14:00:47 raspberrypi systemd[1]: Starting /etc/rc.local Compatibility...
Dec 31 14:00:47 raspberrypi systemd[1]: Started /etc/rc.local Compatibility.

tmack8001 commented 2 years ago

By "crash" what do you mean? Simply it was working before and then started 502ing? That rc.local log doesn't have anything is rc.local where you have the process name for picobrew_pico setup (this is the process we use in the RaspberryPi image provided, but I know you are a custom setup...)? Did you access the webserver at all?

BuckoWA commented 2 years ago

Sorry - I need to be clearer. I've got a dedicated router doing dnsmasq and I'm just running the code from the Pi's command line: python3 server.py 0.0.0.0 8080

When I get the 502, the python process has ended; I confirmed that using ps. I recently generated the SSL certificate and added a symbolic link for picobrew.com.conf to nginx, whereas previously I was just running on port 80 (just python3 server.py). Thanks.

tmack8001 commented 2 years ago

What do you get as the python output when you experience the 502? After the 502 the server isn't running anymore?

tmack8001 commented 2 years ago

Here is the rc.local that the pre-built raspberrypi image uses, if you're interested in comparing python startup differences.

https://github.com/chiefwigms/picobrew_pico/blob/master/scripts/pi/00-run-chroot.sh#L296-L324

tmack8001 commented 2 years ago

One can start python via python3 server.py <interface> <port> & (the & runs it in the background); this is how it's done in the rc.local service included in the pre-built RaspberryPi image. That service gets auto-restarted by the Raspbian OS upon fatal failures and such.
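
If you are launching server.py by hand rather than via rc.local, a simple wrapper can approximate that restart-on-failure behaviour (a sketch under those assumptions, not the project's official startup script; interface/port match the command quoted above):

while true; do
    python3 server.py 0.0.0.0 8080                        # run the server in the foreground
    echo "server.py exited, restarting in 5 seconds" >&2  # note the exit, then relaunch
    sleep 5
done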

Anyways, if you start the python server via python/python3, there is no timeout within the Python code, so any timeout is occurring at the nginx layer. Whereas if you were to, say, use gunicorn to manage the python threads/workers answering requests, there would be a default timeout of 30s at the python layer which could be getting in the way. The default timeouts within nginx are 60s (not customized in our scripts), so you could try adding this to bump those timeout values up higher and see if that resolves the issue:

# proxy_connect_timeout 60;  # this is the time to establish a connection, not the problem here... ignore
proxy_send_timeout 605;
proxy_read_timeout 605;
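
After adding those directives to the site config (e.g. the picobrew.com.conf mentioned above), nginx needs to pick up the change; a minimal sketch, assuming nginx is managed by systemd as on the stock image:

sudo nginx -t                # sanity-check the edited config first
sudo systemctl reload nginx  # apply the new timeouts without dropping existing connections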

There could also be a case where too much data is being sent to the frontend (number of simultaneous graphs / number of active devices). I found this to be the case with the archive view, which loaded ALL data for ALL sessions ALL the time, when most of the time the user only cares about a recently run session, so I switched to a maximum of 10 loaded at a time. (Depending on how long a fermentation or brew session ran and how much data it contained, even just one could overwhelm the system enough to drag performance to a halt. I have some ideas for reducing the overall size of a session log, but haven't done anything with them yet.)

Anyways, if the size of the response is an issue, try increasing the nginx proxy temp file size (default 1024m):

proxy_max_temp_file_size 4096m;  # keep in mind the amount of available memory you have on your specific Pi setup

BuckoWA commented 2 years ago

Thanks. I'll give it a shot and let you know results. Appreciate the support!

tmack8001 commented 2 years ago

@BuckoWA any update here?

BuckoWA commented 2 years ago

@tmack8001 Thanks for the ping.

I tried the following within the nginx conf file:

proxy_send_timeout 600s;
proxy_read_timeout 600s;
proxy_connect_timeout 75s;

On my latest brew sessions, it ran smoothly during two brews, but had an issue during fermentation running two Picoferms and two iSpindels. Checking the logs shows this error: connect() failed (111: Connection refused) while connecting to upstream

I did not try increasing the nginx buffer size but will try that next. Looking back on some old error logs, I do see some "worker process exited on signal 9" errors, which I believe is memory-related? I did update my rc.local to be similar to the chroot.sh script, but forgot to grab the systemctl output after it failed.
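
Workers dying on signal 9 often points at the kernel OOM killer; a quick way to check from an ssh session on the Pi (a sketch, not specific to this project):

dmesg | grep -i -E "killed process|out of memory"  # kernel OOM-killer messages since boot
journalctl -k | grep -i "out of memory"            # same check via the systemd journal
free -h                                            # current memory headroom on the Pi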