markwal / OctoPrint-PolarCloud

OctoPrint plugin that connects OctoPrint to the Polar3D cloud. You easily monitor your printer from outside your LAN without arcane setup of port forwarding, etc. Plus, take advantage of the rapidly evolving Polar Cloud capabilities and UI.
GNU Affero General Public License v3.0
15 stars 8 forks source link

Printer offline #9

Closed mifbody closed 7 years ago

mifbody commented 7 years ago

Printer is online but PolarCloud is saying the printer is offline. Are there any ports that have to be opened?

dnewman-polar3d commented 7 years ago

What's the printer's serial number (in the Polar Cloud)? I can definitively tell you if it's connected at the moment or not.

dnewman-polar3d commented 7 years ago

Oddly, I found my printer not connected and claiming (in its logs) it was unable to get a connection.... Not too sure what was up. Logging in the cloud showed that it had closed it's connection at 21:32:03 UTC. I restarted Octoprint and it connected up just find at 22:02:22 UTC. And there were plenty of other Octoprint printers happily connected. (And other printers keep their connections nailed up for days and days.)

So, we should keep an eye on this. Python may be using a different heartbeat to keep connections alive that isn't quite meshing right when idle with Node.js? ("ping" and "pong" stuff in WebSockets.)

dnewman-polar3d commented 7 years ago
2017-06-25 20:42:13,236 - octoprint.plugins.polarcloud - WARNING - unable to create socket to Polar Cloud, check again in 60 seconds
2017-06-26 21:32:08,069 - octoprint.plugins.polarcloud - WARNING - unable to create socket to Polar Cloud, check again in 2880 seconds

It (1) should be more aggressive on retries. Way more aggressive. And I'm suspecting that yeah, something up with heartbeats.

dnewman-polar3d commented 7 years ago

For reference, cloud uses the defaults for socket.io-client (Node.js)

  1. reconnect automatically: true
  2. reconnectionAttempts: infinite
  3. reconnectionDelay: 1000ms with a random 500 to 1500ms initial delay added (to avoid the thundering herd issue)
  4. reconnections: defaults to 5000, not clearly document ("Manager" section of https://socket.io/docs/client-api/). It's not the number of attempts, but how subsequent reconnect attempts are delayed.
  5. Ping and Pong defaults and info at https://socket.io/docs/server-api/.
dnewman-polar3d commented 7 years ago

@mifbody note that if the Octoprint is not connected to the printer via USB, you may see an offline. Plan is that you'll instead see a message in the cloud that it's not connected and a CONNECT button will appear; however, I do not know if the plugin is yet sending that status code or not to the cloud.

mifbody commented 7 years ago

@dnewman-polar3d - Sorry, serial number is OP000137.

dnewman-polar3d commented 7 years ago

Looks like it received a serial number but then never connected back to the cloud. Once it has a serial number, it disconnects and then reconnects using that number. But wait, it got an SN, reconnected, sent a hello, and then within the same second, sent a status update. Since much of this is all asynchronous, the status update was likely sent before the validation of the hello had completed.

screen shot 2017-06-26 at 7 34 36 pm

So, the server closed the connection. Now, why the plugin didn't try to reconnect, I don't know offhand. Mark can look at that. However, I'll also suggest to mark that there be a 5s delay between sending the hello to the server and then sending any commands. That gives the server time enough to pull info from the db to validate the signed challenge sent to it and then mark the socket as validated.

dnewman-polar3d commented 7 years ago

In the meantime, you can likely just restart the octoprint server process. What sort of platform is it running on? Something faster than your typical RPi?

mifbody commented 7 years ago

In the middle of a print so I'll have to restart it in awhile. It's running on a standard RPi.

dnewman-polar3d commented 7 years ago

Ahhh, I see what the problem is... The plugin failed to disconnect and reconnect after being assigned a serial number. It should have closed the connection and opened a new one.

Issue being, the socket (connection) is flagged with the connection time. It then must validate within 60 seconds of connecting. If it attempts any commands requiring validation (e.g., sending a status update), those commands are ignored within that 60 second window. Outside of the window -- past 60 seconds -- the connection is summarily closed if a command is issued but validation has not completed.

The exception is commands not requiring validation: e.g., registering. So, you printer connected, had a couple of failures (wrong pin it looks?), and then eventually got a serial number. But by that time the 60s had elapsed. So, it then tried to validate -- that was ongoing -- and it also sent a status update. Validation had not finish when that status update was sent, 60 seconds had long elapsed, so the connection was closed.

mifbody commented 7 years ago

Nice find! The wrong pin was right - I learned the hard way that the 4 digit pin polar cloud assigned on its own wouldn't work - I had to enter my own pin, then save it on polar then enter it into the plug in and save.

On Mon, Jun 26, 2017, 10:52 PM dnewman-polar3d notifications@github.com wrote:

Ahhh, I see what the problem is... The plugin failed to disconnect and reconnect after being assigned a serial number. It should have closed the connection and opened a new one.

Issue being, the socket (connection) is flagged with the connection time. It then must validate within 60 seconds of connecting. If it attempts any commands requiring validation (e.g., sending a status update), those commands are ignored within that 60 second window. Outside of the window -- past 60 seconds -- the connection is summarily closed if a command is issued but validation has not completed.

The exception is commands not requiring validation: e.g., registering. So, you printer connected, had a couple of failures (wrong pin it looks?), and then eventually got a serial number. But by that time the 60s had elapsed. So, it then tried to validate -- that was ongoing -- and it also sent a status update. Validation had not finish when that status update was sent, 60 seconds had long elapsed, so the connection was closed.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/markwal/OctoPrint-PolarCloud/issues/9#issuecomment-311238315, or mute the thread https://github.com/notifications/unsubscribe-auth/AMXAwJLHxQcs1k3KuI2ClkjXZhIuETeUks5sIG6IgaJpZM4OF5GC .

-- Brian Meissen Meissenation, LLC http://www.meissenation.com/

markwal commented 7 years ago

I believe the issues discussed here are fixed in current master