mr-canoehead / network_performance_monitor

Network Performance Monitor - a portable tool for troubleshooting performance issues with home networks
GNU General Public License v3.0

New Installs Display Empty Dashboard, Apparent socket.io Incompatibility #28

Closed jamey closed 3 years ago

jamey commented 3 years ago

My previous install was broken, so I started from scratch.

This is on a fresh install, RPI4, Raspbian 2020-12-02. As per the instructions, apt update and apt upgrade were run prior to installation.

The install went okay, but once set up and commissioned, the dashboard displays no data - just empty frames where the information should be. The nginx access logs show 400 return codes to any request to /socket.io/*.

The nginx error logs show this:

2020/12/27 15:16:20 [error] 633#633: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.1.136, server: netperf-dashboard, request: "GET /socket.io/?EIO=3&transport=polling&t=NQbs3rN HTTP/1.1", upstream: "http://[::1]:8000/socket.io/?EIO=3&transport=polling&t=NQbs3rN", host: "hostname", referrer: "http://hostname/"

It seems to be the netperf-dashboard that controls this. The logs there show the following.

Dec 27 15:02:40 palantir systemd[1]: Starting Network Performance Monitor Dashboard...
Dec 27 15:02:40 palantir systemd[1]: Started Network Performance Monitor Dashboard.
Dec 27 15:02:41 palantir env[580]: [2020-12-27 15:02:41 -0500] [580] [INFO] Starting gunicorn 19.9.0
Dec 27 15:02:41 palantir env[580]: [2020-12-27 15:02:41 -0500] [580] [INFO] Listening at: http://0.0.0.0:8000 (580)
Dec 27 15:02:41 palantir env[580]: [2020-12-27 15:02:41 -0500] [580] [INFO] Using worker: eventlet
Dec 27 15:02:41 palantir env[580]: [2020-12-27 15:02:41 -0500] [735] [INFO] Booting worker with pid: 735
Dec 27 15:02:49 palantir env[580]: The client is using an unsupported version of the Socket.IO or Engine.IO protocols (further occurrences of this error will be logged with level INFO)
Dec 27 15:06:54 palantir env[580]: [2020-12-27 15:06:54 -0500] [580] [CRITICAL] WORKER TIMEOUT (pid:735)
Dec 27 15:06:54 palantir env[580]: [2020-12-27 15:06:54 -0500] [735] [INFO] Worker exiting (pid: 735)
Dec 27 15:06:54 palantir env[580]: [2020-12-27 15:06:54 -0500] [819] [INFO] Booting worker with pid: 819
Dec 27 15:06:55 palantir env[580]: The client is using an unsupported version of the Socket.IO or Engine.IO protocols (further occurrences of this error will be logged with level INFO)

Cursory investigation indicates that there is indeed a compatibility matrix for socket.io, and I've somehow run afoul of it. My Python skills aren't great, so I've largely exhausted my troubleshooting options here.

jamey commented 3 years ago

Did a little more digging. It looks like, in dashboard.html, you're pulling in socket.io.js version 2.2.0. The version of socket.io I have in dist-packages is 5.0.4.
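For anyone comparing versions the same way, here is a minimal sketch of the check. It reads the EIO value from a request target like the one in the nginx log above and compares it with what a given python-socketio major version is documented to accept. The mapping reflects my reading of the python-socketio compatibility notes (4.x speaks Engine.IO 3, 5.x speaks Engine.IO 4); the function names are made up for illustration and are not part of this project.

```python
from urllib.parse import urlsplit, parse_qs

# Engine.IO protocol revision -> JavaScript Socket.IO client major versions,
# as described in the python-socketio version compatibility notes.
CLIENTS_BY_EIO = {3: ("1.x", "2.x"), 4: ("3.x", "4.x")}


def engineio_revision(request_target: str) -> int:
    """Return the EIO revision from a target such as '/socket.io/?EIO=3&...'."""
    query = parse_qs(urlsplit(request_target).query)
    return int(query["EIO"][0])


def server_accepts(python_socketio_major: int, eio: int) -> bool:
    """Assumed mapping: python-socketio 4.x -> Engine.IO 3, 5.x -> Engine.IO 4."""
    expected = {4: 3, 5: 4}.get(python_socketio_major)
    return expected == eio


# Request target copied from the nginx error log in this issue.
target = "/socket.io/?EIO=3&transport=polling&t=NQbs3rN"
eio = engineio_revision(target)
print(eio, CLIENTS_BY_EIO[eio], server_accepts(5, eio))
```

With the 2.2.0 client script and python-socketio 5.0.4, the last value comes out false, which is consistent with the "unsupported version" message in the dashboard log.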

mr-canoehead commented 3 years ago

It seems that the socket.io library has had a major revision recently. The dashboard web page references the older 2.x client script, which isn't compatible with the new 3.x socket.io library that now gets installed on new systems. The solution is to update the reference in the dashboard web page; I'll do this in the next day or two.

To patch your existing installation, try the following:

1) edit the dashboard web page source file:

sudo nano /opt/netperf/dashboard/html/dashboard.html

2) change the following line:

<script src="//cdnjs.cloudflare.com/ajax/libs/socket.io/2.2.0/socket.io.js" integrity="sha256-yr4fRk/GU1ehYJPAs8P4JlTgu0Hdsp4ZKrx8bDEDC3I=" crossorigin="anonymous"></script>

to:

<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/3.0.1/socket.io.js" integrity="sha384-eR+bbT85XPh+//KUxVKFPo0h5uhDtuh0jm6tk7oaZxmM8KbdLJGayIFp09NGsJA5" crossorigin="anonymous"></script>

3) restart NGINX:

sudo systemctl restart nginx

4) reload the dashboard web page in your browser
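If you substitute a different socket.io client version, the integrity attribute has to match the new file or the browser will refuse to load it. A rough sketch of how a sha384 integrity value can be computed for a downloaded copy of the script follows. The helper names are hypothetical, and it only handles a single sha384 value, not attributes that list several hashes.

```python
import base64
import hashlib


def sri_sha384(content: bytes) -> str:
    """Build a Subresource Integrity value: 'sha384-' + base64 of the raw digest."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Compare file bytes against a single sha384 integrity attribute value."""
    return sri_sha384(content) == integrity.strip()


# Stand-in bytes; in practice, read the downloaded socket.io.js in binary mode.
sample = b"console.log('hello');"
value = sri_sha384(sample)
print(value)
print(matches_integrity(sample, value))
```

Reading the real file with open(path, "rb").read() and comparing against the attribute in dashboard.html would tell you whether the two belong together.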

jamey commented 3 years ago

Can confirm - that does the trick.

mr-canoehead commented 3 years ago

Update: corrected script reference is:

<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/3.0.1/socket.io.js" integrity="sha384-eR+bbT85XPh+//KUxVKFPo0h5uhDtuh0jm6tk7oaZxmM8KbdLJGayIFp09NGsJA5" crossorigin="anonymous"></script>

I tested this on several platforms with various browsers (including Safari on iOS 14); the dashboard web page loads correctly on all of them. I have updated the repository with this corrected script reference.

mr-canoehead commented 3 years ago

I left this open for a while for visibility in case other users encountered the same issue. Since the issue has been fixed in the repo and new installations don't experience it, I figure it's safe to close this issue now.

Thanks for submitting it!

Chris

preese commented 3 years ago

This issue seems to be back. PDFs generate fine, but there is no content on the web page.

Here is an nginx error log line:

2021/05/07 12:01:03 [error] 3413#3413: *337 connect() failed (111: Connection refused) while connecting to upstream, client: xxx.xxx.xx.xx, server: netperf-dashboard, request: "GET /socket.io/?EIO=4&transport=polling&t=Nb8D6GB HTTP/1.1", upstream: "http://127.0.0.1:8000/socket.io/?EIO=4&transport=polling&t=Nb8D6GB", host: "xx.xxx.xx.xx", referrer: "http://xx.xx.xx.xx/"
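A "Connection refused" from nginx usually means nothing is accepting connections at the upstream address at that moment. As a quick way to confirm that from the Pi itself, something like the following sketch could be used. The host and port are taken from the log line above; the function name is just illustrative.

```python
import socket


def upstream_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Upstream address as it appears in the nginx error log.
print(upstream_reachable("127.0.0.1", 8000))
```

If this prints False while the netperf-dashboard service claims to be running, the service log is the next place to look.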

I did see a new version of socket.io, so I tried the same fix as last time and updated to the new version, 4.0.1:

But no luck.

Thanks for taking a look.

I'm setting up an RPI4 to send off to my cousin, who has a new Starlink install, so he can watch his speed and latency.

Phil

mr-canoehead commented 3 years ago

Hi Phil! I was able to replicate this on my test system today. It seems to be a new issue related to the eventlet worker library; the newest version (eventlet 0.31.0) seems to be incompatible with gunicorn. If I remove eventlet 0.31.0 and install an older version (eventlet 0.30.2), the web interface works as expected. Here are the steps I took:

1) remove eventlet 0.31.0:

sudo pip3 uninstall eventlet

2) install eventlet version 0.30.2:

sudo pip3 install eventlet==0.30.2

3) reboot:

sudo reboot
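To see which eventlet version is installed before and after these steps, a small sketch like this may help. It only knows the two versions discussed in this issue (0.31.0 failing, 0.30.2 working) and reports anything else as untested rather than guessing. It assumes Python 3.8 or newer for importlib.metadata; on older interpreters, pip3 show eventlet gives the same information. The helper names are hypothetical.

```python
import re
from importlib import metadata  # available in Python 3.8+

# Only the versions reported in this issue; everything else is unknown.
KNOWN_BAD = {(0, 31, 0)}
KNOWN_GOOD = {(0, 30, 2)}


def parse_version(text):
    """Turn '0.30.2' into (0, 30, 2), ignoring anything past three numbers."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def classify(version_text):
    """Label a version as known-bad, known-good, or untested."""
    version = parse_version(version_text)
    if version in KNOWN_BAD:
        return "known-bad"
    if version in KNOWN_GOOD:
        return "known-good"
    return "untested"


def installed_eventlet():
    """Return the installed eventlet version string, or None if absent."""
    try:
        return metadata.version("eventlet")
    except metadata.PackageNotFoundError:
        return None


found = installed_eventlet()
print("eventlet not installed" if found is None else f"eventlet {found}: {classify(found)}")
```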

Cheers, Chris

preese commented 3 years ago

Hi Chris,

The older eventlet has solved the problem!  Thanks for your fast solution.

I didn't find, in two RPI4 tests, that ping had the wrong permissions, but your note on the wiki will ensure it won't be a problem.

Phil

On 5/7/21 9:42 PM, Chris wrote:

While testing I found another new issue: in the latest release of Raspberry Pi OS the pi user cannot run the ping command without sudo permissions (I get the error "ping: socket: Operation not permitted"). This breaks the network test scripts. To fix this issue, modify the user permissions on the ping executable:

sudo chmod u+s /bin/ping

After doing this the scripts ran normally.
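To check whether the setuid bit is actually set on ping, a sketch along these lines could be used. Note that a False result does not necessarily mean ping is broken, since some systems grant ping its privileges through file capabilities or a sysctl setting instead of setuid. The function names are illustrative only.

```python
import os
import stat


def has_setuid(mode: int) -> bool:
    """True if the setuid bit is present in a file mode."""
    return bool(mode & stat.S_ISUID)


def ping_is_setuid(path: str = "/bin/ping"):
    """Return True/False for the setuid bit, or None if the file is missing."""
    try:
        return has_setuid(os.stat(path).st_mode)
    except FileNotFoundError:
        return None


print(ping_is_setuid())
```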

Cheers, Chris

mr-canoehead commented 3 years ago

Glad that solution worked! I'll close this issue and create a new one specific to eventlet and update the wiki with patching steps. As for the ping permission issue, that seems to be limited to my particular installation (my root filesystem is on a USB flash drive, something must have happened when I rsynced the filesystem to it). For standard installations this does not seem to be an issue.

Chris