HaddingtonDynamics / Dexter

GNU General Public License v3.0

Remote operation of Dexter via Internet #31

Open JamesNewton opened 6 years ago

JamesNewton commented 6 years ago

It will be desirable to operate Dexters remotely via the Internet either to monitor, operate, and/or repair remote devices, or to make Dexters available for general purpose work.

Dexter supports socket connections natively, but those are typically limited to the local network and do not pass through firewalls without additional (and often not allowed) holes being opened. WebSockets can operate freely between most networks, but despite the name, are NOT the same as native socket connections. WebSocket connections can be supported on the Dexter via a simple NodeJS proxy server, but getting out to the internet requires a "chat" server that is accessible from the internet in general. The local Dexter's NodeJS server would connect to the DexChat server, registering the robot as active and opening a WebSocket connection. Humans would log into the DexChat server via a browser, which would be served a web page that would then open WebSocket connections. The DexChat server would list available robots, perhaps filtering the list according to access rights.

The human could initiate a connection to the robot through the DexChat server. Messages would go to the server, which would then relay them to the open WebSocket connection to that Dexter. The NodeJS server on the Dexter would receive each message, relay it as a true socket message to the firmware on localhost, pick up the response, and relay it back to the chat server via WebSocket. The chat server would then relay that back to the human via the WebSocket connection to the browser. This is basically like a Private Message on a standard chat server.
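
The relay step described above can be sketched without any real network code. This is only an illustration, not the DexChat implementation: the `Relay` class, its method names, and the JSON envelope are all hypothetical, and a "connection" is assumed to be any object with a `send(text)` method (such as a WebSocket).

```javascript
// Hypothetical sketch of the chat server's "private message" routing.
class Relay {
  constructor() {
    this.robots = new Map(); // robot name -> connection object with send(text)
  }

  // Called when a Dexter's local proxy connects and announces itself.
  registerRobot(name, connection) {
    this.robots.set(name, connection);
  }

  // Called when that connection closes.
  unregisterRobot(name) {
    this.robots.delete(name);
  }

  // Names visible to a user; `allowed` is a hypothetical access-rights filter.
  listRobots(allowed = () => true) {
    return [...this.robots.keys()].filter(allowed);
  }

  // Forward a message to a robot, tagging the sender so the reply can be
  // routed back. Returns false if that robot is not connected.
  relayToRobot(fromUser, robotName, payload) {
    const robot = this.robots.get(robotName);
    if (!robot) return false;
    robot.send(JSON.stringify({ from: fromUser, payload }));
    return true;
  }
}
```

Replies from the robot would take the same path in reverse, using the `from` field to pick the browser connection.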

Latency

The major concern here is the speed of the connection. The NodeJS proxy server adds about 1.6 ms between Dexter and a Chrome web browser. Because NodeJS is known for very rapid response times and we are a JavaScript-heavy house, we will try that first.

Latency compensation

But as we move out into the internet, latencies of hundreds of ms are not unusual. As a result, Dexter's haptic feedback system can get into oscillations or just lag, causing touch feedback to become unusable. The solution is probably to work towards predicting the next movement when sending data from the human to the robot, and predicting forces when sending data back. Obviously, the former is easier than the latter, but 3D scanning the work area and proximity detection before actual touch can help.

Forward Latency compensation

We've already experimented with predicting where the operator is moving the arm via spline curve fitting and it seems like that will work well, but we are concerned about noise in the system and it needs more testing.
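
As a rough illustration of forward prediction, here is a minimal sketch that extrapolates one joint angle from its last three samples. Note that this swaps in a plain quadratic (Lagrange) extrapolation rather than the spline fitting mentioned above, and the `{t, angle}` sample shape and function name are assumptions made for the example. It does nothing about noise, which is the open concern.

```javascript
// Predict a joint angle at time t by passing a quadratic through the last
// three (time, angle) samples and evaluating it at t (Lagrange form).
function predictAngle(samples, t) {
  if (samples.length < 3) throw new Error('need at least three samples');
  const [a, b, c] = samples.slice(-3);
  // Lagrange basis weights for each of the three samples at time t.
  const wa = ((t - b.t) * (t - c.t)) / ((a.t - b.t) * (a.t - c.t));
  const wb = ((t - a.t) * (t - c.t)) / ((b.t - a.t) * (b.t - c.t));
  const wc = ((t - a.t) * (t - b.t)) / ((c.t - a.t) * (c.t - b.t));
  return a.angle * wa + b.angle * wb + c.angle * wc;
}
```

The further ahead `t` is from the last sample, the more any noise in the three samples is amplified, so the prediction horizon should probably track the measured latency and no more.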

Back Latency compensation

Understanding what forces the remote arm will encounter /in the future/, which is required for latency compensation, is very difficult. Understanding the 3D environment and simulating when the arm will contact an object is one possible approach. Point cloud data is accessible via existing 3D scanners, and can be localized to the end effector to reduce data size. Having to transmit all the points back to the controller is a non-starter; it will be critical to difference the current scan against the prior scan and only transmit the changes, or to convert the point cloud to a mesh, or some combination of that and other tactics.
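
One way to picture the scan differencing is to snap points to a voxel grid and send only the voxels that appeared or disappeared. This is a sketch under assumptions: points are `[x, y, z]` arrays, the voxel size is a made-up tuning parameter, and the function names are hypothetical.

```javascript
// Quantize a point to a voxel key so nearby points in two scans compare equal.
function voxelKey([x, y, z], size) {
  return [x, y, z].map((v) => Math.floor(v / size)).join(',');
}

// Return only the voxels that changed between two scans.
function diffScans(prior, current, size = 0.01) {
  const before = new Set(prior.map((p) => voxelKey(p, size)));
  const after = new Set(current.map((p) => voxelKey(p, size)));
  return {
    added: [...after].filter((k) => !before.has(k)),
    removed: [...before].filter((k) => !after.has(k)),
  };
}
```

A static scene then costs almost nothing to transmit, and the voxel size trades bandwidth against how finely contact can be predicted.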

The video game industry has done important work on this and we hope game developers will be interested in adding haptic feedback via the robot arm.

JamesNewton commented 6 years ago

Compute Engine / Websockets attempt:

Failed: -DDE can't connect to Self Signed Certs, -Compute Engine w/ Node doesn't support Websockets.

Because Google cloud services provides a few free hours each month, we will try to spin up on that platform. As far as we can tell, Google supports WebSockets via the Compute Engine, and on the Flexible version of App Engine. Since the compute engine will support anything, including nodejs, that's probably the best place to start.

A server has been set up generally following this guide: https://medium.com/google-cloud/node-to-google-cloud-compute-engine-in-25-minutes-7188830d884e (note to self: Under Compute Engine, VM instances, when you can't find the SSH link, scroll right.) There were a few minor changes needed:

The end result is a Debian 9.2 "Stretch" based Linux OS and node / express with react server via node reverse proxy. The source files are: https://github.com/ColeMurray/react-express-starter-kit You can add content in multiple places:

Security: It is critical to not just allow a chat server to exist out in the world without encryption and user login; otherwise any sort of human communication /will/ be used to distribute porn. We could try tightly constraining the allowed messages to those that Dexter firmware can support, but that would limit our ability to connect to the job engine. User log in, and therefore encryption is critical to avoid the chat server being used for "bad things". Sadly, decrypting and encrypting data WILL slow down the connection. Perhaps a combination approach can work? Very tightly constrained message types consisting of joint angle / torque data can flow plain text, and other more general user interface items can be encrypted and unrestricted.
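
To make the "tightly constrained message types" idea concrete, here is a sketch of a whitelist check for the plain text path. The message shape (`type`, `angles`, `torques`) and the default of five joints are assumptions for illustration only, not a format Dexter or DDE defines.

```javascript
// Accept only messages that are exactly {type:'joints', angles:[...], torques:[...]}
// with one finite number per joint. Anything else would need the encrypted path.
function isPlainTelemetry(msg, jointCount = 5) {
  const isNumberList = (v) =>
    Array.isArray(v) && v.length === jointCount && v.every((n) => Number.isFinite(n));
  return (
    msg !== null &&
    typeof msg === 'object' &&
    Object.keys(msg).length === 3 && // no extra fields to smuggle text in
    msg.type === 'joints' &&
    isNumberList(msg.angles) &&
    isNumberList(msg.torques)
  );
}
```

Because every field is a fixed-length list of numbers, there is no room in such a message for free-form human communication.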

To avoid constant re-transmission of passwords, we need to support session level authentication. This seems like a good starting point: https://github.com/Createdd/Writing/blob/master/2017/articles/AuthenticationIntro.md

A test file make_user_db.js shows that the above is working outside express.

The complexity of session management is well explained by this video: https://www.youtube.com/watch?v=OH6Z0dJ_Huk

Finally ready to add web sockets:

And.... web sockets don't work on Compute Engine when using node.js (can use python). But more importantly, you can't connect to the server using a self signed certificate via DDE. At this point, the decision was made to go on to App Engine instead since it provides a valid certificate. It also doesn't support web sockets, but HTTP and BOSH might work.

JamesNewton commented 5 years ago

Because it's faster, and lighter, we are using: https://github.com/websockets/ws instead of Websocket.IO. Also, I couldn't get it to install on Dexter. ,o)

JamesNewton commented 5 years ago

For access to the webcam, this works for still pictures, but is very slow (e.g. 2 frames per second, at 320x200) https://github.com/chuckfairy/node-webcam It requires installation of an external program: sudo apt-get install fswebcam

Part of the issue may be that fswebcam wants to save every frame to hard drive (sdcard in our case). This may help https://github.com/chuckfairy/node-webcam/issues/10 It appears that fswebcam can return the image data to stdout when given a special filename via an option: --save '-' https://github.com/fsphil/fswebcam/issues/11 But initial testing shows that isn't much faster.

JamesNewton commented 5 years ago

Installed the ZeroTier client in Dexter: https://www.zerotier.com/download.shtml via curl -s https://install.zerotier.com/ | sudo bash which reports: *** Success! You are ZeroTier address [ 5f6280cf58 ].

However, it strikes me that this is completely useless because now EVERY Dexter that is put out there will think it's that ZeroTier address. In fact, doesn't installing ZeroTier need to be re-done on each robot AFTER the image is updated? Or at least, it needs to be triggered to go get a new address. But how? This question has been asked of their community: https://community.zerotier.com/zerotier/pl/s5waqqpxcp8gbfemckgz1odxsy

JamesNewton commented 5 years ago

ZeroTier says:

If you remove /var/lib/zerotier-one/identity.secret and /var/lib/zerotier-one/identity.public from your base image, a unique identity will be generated the first time zerotier starts.

So that should sort it... before making the image, we can delete those items, then when the image is started in a new robot, and it's connected to the internet, it should go and get a new address.
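
The image-preparation step could be scripted so it isn't forgotten. This is a small Node sketch, assuming the two file paths quoted from ZeroTier above; the function name is hypothetical, and on a real robot it would need to run as root before the image is captured.

```javascript
const fs = require('fs');
const path = require('path');

// Delete the ZeroTier identity files so a fresh identity is generated the
// first time ZeroTier starts on a robot booted from this image.
// Returns the names of the files that were actually removed.
function removeZeroTierIdentity(dir = '/var/lib/zerotier-one') {
  const removed = [];
  for (const name of ['identity.secret', 'identity.public']) {
    const file = path.join(dir, name);
    if (fs.existsSync(file)) {
      fs.unlinkSync(file);
      removed.push(name);
    }
  }
  return removed;
}
```

Calling `removeZeroTierIdentity()` with no argument targets the default directory; passing another directory makes it easy to try safely first.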

Started a checklist for making a new image and added this instruction to it: https://github.com/HaddingtonDynamics/Dexter/wiki/SD-Card-Image#checklist-for-making-new-image

slivingston commented 5 years ago

is there any new progress to report about this?

soon I will finish assembling the Dexter HD kit. later this month, I plan to share it via the Internet through a system that I am building, https://rerobots.net/

I am happy to contribute benchmarks, discuss use-cases, etc. so we can explore together good methods for accessing Dexter remotely.

JamesNewton commented 4 years ago

To reduce latency, relay (e.g. chat) servers must be able to "punch through" NAT and allow peer to peer communications. I haven't been able to get ZeroTier to work. It's VERY complex. This seems simple, and would be a good test: https://github.com/SamDecrock/node-tcp-hole-punching

JamesNewton commented 4 years ago

https://docs.husarnet.com/info/ Appears to be doing more or less what we want to do in terms of connecting things across networks. Although they say it always goes peer to peer in other places, in their documentation, they admit that each peer to peer option can fail if the firewall is tight. e.g.

  • First, the Husarnet client connects to the base server (via TCP on port 443 and optionally UDP on port 5582) hosted by Husarion. Husarion runs multiple geographically distributed base servers. Initially the encrypted data is tunnelled via the base server.
  • The devices attempt to connect to local IP addresses (retrieved via the base server). This will succeed if they are in the same network or one of them has public IP address (and UDP is not blocked).
  • The devices attempt to perform NAT traversal assisted by the base server. This will succeed if NAT is not symmetric and UDP is not blocked on the firewall.
  • The devices send multicast discovery to the local network. This will succeed if the devices are on the same network (even if there is no internet connectivity or the base server can't be reached).

TODO: Try this on a Dexter. Note it doesn't support Windows PC's or Node. It needs to be installed in Linux. I still think a websockets version, although slower, will be better in the long run because it supports higher level communication between controlling PCs, which provides access to the resources of the PC.

JamesNewton commented 4 years ago

Jitsi Attempt

Failed:

Jitsi is fully open source (https://github.com/jitsi/jitsi-meet) and provides an excellent user experience for video, audio, and text chat. It is, at heart, a node.js based NPM installed service, but it uses other systems for video and audio. All of that gets installed on a server, which we can do at some point in the future, but do NOT need to do now because https://meet.jit.si/ is up and running and does NOT require any user account, payment, or other BS to start using.

The goal would be to allow DDE or a Unity app to join a meet.jit.si meeting and then send and receive chat messages with each other.

There is an API for embedding in your own application https://github.com/jitsi/jitsi-meet/blob/master/doc/api.md But I can find no mention of how to send text messages in that API...

Looks like the core of this is UV4L and with a few things added to an rPi, the rPi can create and join Jitsi meetings without a browser, so this IS possible. https://www.linux-projects.org/uv4l/tutorials/jitsi-meet/

Jitsi uses XMPP over BOSH. Apparently we can " join the muc myroomname@conference.mydomain.com". I assume that would be something like roomname@meet.jit.si and we would connect via BOSH and then use XMPP to "sent the message to the participant you want". Apparently, XMPP over BOSH is sort of its own thing called XEP-0206. I can't find any NPM packages for that, and searching for XMPP BOSH finds a number of packages for a specific service called pubnub which does not appear to be FOSS. The best match I can find seems to be: https://www.npmjs.com/package/strophe.js

Communications: XMPP / Jabber

The specification for XMPP in MUC's (Multi-User Chat) is found at: https://xmpp.org/extensions/xep-0045.html

So what I get from this is that they have a format of: room@service/nick where "nick" is the user name in the room only, not the actual user name, "room" is one specific meeting room or identifier for a video call, and "service" is the host domain.

So if it were "James Newton" in https://meet.jit.si/massmind then they would say massmind@meet.jit.si/James%20Newton It's probably best to avoid nicknames with spaces while we are starting.

They call room@service/nick the JID. I'm guessing that JID is "Jabber ID" since this used to be called Jabber.
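
To check our reading of the room@service/nick format, here is a tiny parser sketch. The function name is hypothetical and it only handles the occupant form discussed here, not every JID the XMPP specs allow.

```javascript
// Split an occupant JID of the form room@service/nick into its three parts.
function parseOccupantJid(jid) {
  const at = jid.indexOf('@');
  const slash = jid.indexOf('/', at + 1);
  if (at < 1 || slash < 0) throw new Error('expected room@service/nick');
  return {
    room: jid.slice(0, at),            // the meeting room
    service: jid.slice(at + 1, slash), // the host domain
    nick: jid.slice(slash + 1),        // the name used inside the room only
  };
}
```

For example, `parseOccupantJid('massmind@meet.jit.si/cfry')` gives room `massmind`, service `meet.jit.si`, and nick `cfry`.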

"A user enters a room (i.e., becomes an occupant) by sending directed presence to room@service/nick."

"An occupant exits a room by sending presence of type "unavailable" to its current room@service/nick."

Section 6.4 is how to ask a room what features it supports. We want to make sure it's not private or requiring a login. This probably isn't required, at least not at first.

Section 7.2.1 shows how to join a room:

<presence
    from='hag66@shakespeare.lit/pda'
    id='n13mt3l'
    to='coven@chat.shakespeare.lit/thirdwitch'>
  <x xmlns='http://jabber.org/protocol/muc'/>
</presence>

So the <x xmlns='http://jabber.org/protocol/muc'/> just says "I'm speaking your language". The to= appears to be room@service/nick, which I think means that a user known as "hag66@shakespeare.lit/pda" wants to be known as "thirdwitch" while joining a MUC in a room called "coven" on the "chat.shakespeare.lit" server. Apparently the from field is required? But on Jitsi there is no user signup, so do we just make that up? I think the id is just a random string to make sure messages don't get misrouted.

Section 7.5 is sending a private message to another user in the MUC

<message
    from='wiccarocks@shakespeare.lit/laptop'
    id='hgn27af1'
    to='coven@chat.shakespeare.lit/firstwitch'
    type='chat'>
  <body>I'll give thee a wind.</body>
  <x xmlns='http://jabber.org/protocol/muc#user' />
</message>

So here, it appears that the from= is the user's real id, but the server will replace it with the room address and the nick that user joined the room under ("secondwitch" in this example) before sending the message on to the to= user. The to= starts with the current room/server and the nick for the intended destination just gets added to that. So then the message gets sent to the nick "firstwitch", which the server happens to know is actually crone1@shakespeare.lit/desktop.

<message
    from='coven@chat.shakespeare.lit/secondwitch'
    id='hgn27af1'
    to='crone1@shakespeare.lit/desktop'
    type='chat'>
  <body>I'll give thee a wind.</body>
  <x xmlns='http://jabber.org/protocol/muc#user' />
</message>

It's possible that this is all we need to know about the room system.
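
Putting the pieces above together, a private message stanza can be assembled as a string. This is a sketch for understanding the format only; a library like Strophe would normally build the XML, and the function names here are hypothetical.

```javascript
// Escape text for safe use inside XML attributes and element bodies.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/'/g, '&apos;')
    .replace(/"/g, '&quot;');
}

// Build a MUC private message addressed to room@service/nick.
function buildPrivateMessage({ from, room, service, nick, id, body }) {
  const to = `${room}@${service}/${nick}`;
  return (
    `<message from='${escapeXml(from)}' id='${escapeXml(id)}' ` +
    `to='${escapeXml(to)}' type='chat'>` +
    `<body>${escapeXml(body)}</body>` +
    `<x xmlns='http://jabber.org/protocol/muc#user'/>` +
    `</message>`
  );
}
```

The escaping matters because joint data or user text containing `<` or `&` would otherwise break the stanza.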

Connection: BOSH / Strophe / meet.jit.si

The connection part of the puzzle has been much harder to figure out. BOSH is relatively easy to understand, but the documentation for Strophe, while massive, is nearly devoid of examples, and there is ZERO documentation on how to connect to meet.jit.si. An issue has been raised: https://github.com/jitsi/jitsi-meet/issues/5559

Requests for help in their community have been met with cryptic and terse responses: https://community.jitsi.org/t/inserting-locally-connected-device-data-into-chat/24118/3 A new, more forceful, request has been made: https://community.jitsi.org/t/authentication-fail-on-connect-but-jitsi-doesnt-require-authorization/28566

At this point, when we run the code below, we get

1 CONNECTING 3 AUTHENTICATING 2 CONNFAIL 6 DISCONNECTED

var conn = new Strophe.Connection("https://meet.jit.si/http-bind")
conn.connect('massmind@meet.jit.si/cfry', //fake jid, that's ok right?
             "", //don't need password?
             function(status_code) { //status_code is a small non-neg integer
                   console.log("connected with status: " + status_code + " " + strophe_status_code_to_name(status_code))
             }) 

So an authentication failure? This is very confusing since jitsi doesn't require any authentication...
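
The snippet above calls strophe_status_code_to_name without showing it. A minimal version might look like the following; the name table is an assumption that mirrors the Strophe.Status constants as we understand them, and it agrees with the four codes printed above.

```javascript
// Status names indexed by Strophe status code (assumed to match Strophe.Status).
const STROPHE_STATUS_NAMES = [
  'ERROR',          // 0
  'CONNECTING',     // 1
  'CONNFAIL',       // 2
  'AUTHENTICATING', // 3
  'AUTHFAIL',       // 4
  'CONNECTED',      // 5
  'DISCONNECTED',   // 6
  'DISCONNECTING',  // 7
  'ATTACHED',       // 8
];

// Translate a numeric status code into a readable name for logging.
function strophe_status_code_to_name(status_code) {
  return STROPHE_STATUS_NAMES[status_code] || 'UNKNOWN';
}
```

Read this way, the sequence we see is a connection failure (2) reported while authenticating (3), not an explicit AUTHFAIL (4), which may be a useful distinction when asking for help.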

In the end, if we can't figure out how to make the connection, we will need to just move on and find some other way to do this. It's a crying shame, as Jitsi being FOSS is just tailor made!

JamesNewton commented 4 years ago

App Engine attempt

Status: Working via BOSH

This is a pretty standard node / express server:

const express = require('express')
const http = require('http')
const WebSocket = require('ws') //https://github.com/websockets/ws
const bodyParser = require("body-parser") //parse post body data
const bcryptjs = require('bcryptjs') //to hash passwords for storage

const app = express()
const PORT = 8080

For node.js on the app engine, there are two environments: Standard and Flexible. https://cloud.google.com/appengine/docs/nodejs https://cloud.google.com/appengine/docs/the-appengine-environments To reduce ongoing cost of operation, we are starting with the standard environment. Sadly, that does not support websockets, which would be very easy to use, but luckily, Jitsi taught us about BOSH.

https://en.wikipedia.org/wiki/BOSH_(protocol) BOSH supports server push notifications by simply not responding to a request from the client until the server has something to send to the client. The request simply hangs until data arrives (via another connection to the server from the sender) or until it times out. In either case, it is the client's responsibility to re-establish the request as quickly as possible and so continue listening for data from the server, or more accurately from a sender via the server.
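
The client's side of that contract is just a loop that re-issues the hanging request. Here is a sketch with the HTTP call abstracted away: `poll` is a hypothetical function that performs one hanging request and resolves to `{ timedOut, data }`, and the one second back-off on errors is an arbitrary choice.

```javascript
// Keep one request pending at all times: as soon as it returns (with data or
// a timeout), handle the result and immediately ask again.
async function listen(poll, onMessage, keepGoing) {
  while (keepGoing()) {
    try {
      const reply = await poll(); // hangs until data arrives or the server times out
      if (!reply.timedOut) onMessage(reply.data);
    } catch (err) {
      // Network error: pause briefly so we don't hammer the server, then retry.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```

In a browser, `poll` could wrap a `fetch` of the listening URL and treat a timeout status as `timedOut: true`.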

The other issue with the Standard Environment is that it can be shut down and started up at Google's will. This makes keeping session state interesting. You can log in, get a cookie, and if that cookie was stored in a local RAM session store, it can be forgotten at any moment. Luckily, @google-cloud/connect-datastore supports using a Cloud Firestore in Datastore mode as the session store.

const {Datastore} = require('@google-cloud/datastore');
const session = require('express-session') //session mgmt
const DatastoreStore = require('@google-cloud/connect-datastore')(session);
const data_store = new Datastore({
      // @google-cloud/datastore looks for GCLOUD_PROJECT env var. Or pass a project ID here:
      // projectId: process.env.GCLOUD_PROJECT,
      projectId: APP_ENGINE_PROJECT_ID,
      // @google-cloud/datastore looks for GOOGLE_APPLICATION_CREDENTIALS env var. Or pass path to your key file here:
      keyFilename: process.env.GOOGLE_APPLICATION_CREDENTIALS
    })

//Session setup separate sessionParser so
app.use(session({
  store: new DatastoreStore({
    kind: 'express-sessions', 
    // Optional: expire session after exp milliseconds. 0 means do not expire
    // note: datastore doesnt auto del expired sessions. Run separate cleanup req's to remove expired sessions
    expirationMs: 0,
    dataset: data_store
  }),
  resave: false,
  saveUninitialized: false,
  secret: 'Like id tell you'
}));

The only problem with that is that sessions are permanent and do not expire, even if you tell them to. It's necessary to manually clear out the expired sessions from the datastore.

//just serve up static files if they exist
app.use(express.static('public')) 
app.use(bodyParser.json())
app.use(bodyParser.urlencoded({extended: true}))

/* Authentication section removed, it basically just sets a user name
req.session.user = username
*/

Rather than use available BOSH packages (which were poorly documented and very confusing), a microscopic version of BOSH was included. For this service, we can be relatively certain that the engine will not be shut down, because it's always pending a BOSHout request. As long as one listener is listening, Google doesn't have time to shut down the engine and therefore drop the boshs array.

Operation: BOSHout requests come in, get added to the boshs array, and then nothing else is done. There is no reply, no error, no nothing. This causes the socket to stay open, as the client waits for the server to respond. Next, a BOSHin request comes in, with a "to" parameter for the user who made the BOSHout request. Since that user is found in the boshs array, the socket response object can be found, and the message sent to that client. The entry in the boshs array is then deleted so it won't be found by another BOSHin until the BOSHout client re-establishes the connection.

//MicroBOSH
var boshs = {} //pending BOSHout listeners, keyed by session user name
const BOSHtimeout = 30000 //in milliseconds

app.get('/BOSHout', function (req, res, next) { 
  console.log("BOSHout,  user:"+req.session.user)
  if (!req.session.user) {
    var err = new Error('FAIL: Login')
    err.status = 403
    return next(err) //skips to error handler; return so no listener is registered
    }
  const user = req.session.user
  boshs[user]={}
  boshs[user].res=res
  boshs[user].int=setTimeout(function(){ 
        //console.log('BOSHout timeout '+user)
        res.status(408)
        res.send('BOSHout timeout') //send() also ends the response
        //only drop the entry if it is still ours (the user may have re-connected)
        if (boshs[user] && boshs[user].res === res) delete boshs[user]
    }, BOSHtimeout)
//dont reply or end. 
})

app.get('/BOSHin', function(req,res, next) {
    //Check that the requestor is logged in. 
    if (!req.session.user) {
        var err = new Error('FAIL: Login')
        err.status = 403
        return next(err) //skips to error handler; return so we stop here
        }
    let msg = req.query.msg || "none"
    let to = req.query.to
    let bosh = {} //stays empty if no listener is waiting for "to"
    //parameter is the session.user to send the message to
    if (to) { //console.log("to "+to)
        if ( boshs[to]) { console.log("found")
            bosh = boshs[to] //lookup pending response from username
            }
        }
    bosh.stat = "unknown"
    if (bosh.int) { //console.log("clearing timeout")
        clearTimeout(bosh.int) //it was created with setTimeout
        bosh.int=undefined
        }
    if (bosh.res) { //console.log("responding")
        bosh.res.setHeader("from_user", req.session.user)
        bosh.res.send('BOSHin message:' + msg) //don't change w/o Fry. send() also ends the response
        bosh.stat = bosh.res.headersSent ? "good" : "bad" 
        //https://expressjs.com/en/4x/api.html#res.headersSent
        delete boshs[to] //this response object is used up
        }
    console.log("to:"+req.query.to+". stat:"+bosh.stat+". ")
    res.send('BOSHin to:' +req.query.to + ". Status:" + bosh.stat) 
})

If the user requested in the BOSHin is not found in the boshs array, "unknown" is returned to the BOSHin client. Otherwise, "good" is returned.

The rest of the code is just the standard node / express setup:

// If we get here, the file wasn't found. Catch 404 and forward to error handler
app.use(function (req, res, next) {
  var err = new Error('File Not Found');
  err.status = 404;
  next(err);
});

// Error handler define as the last app.use callback
app.use(function (err, req, res, next) {
  res.status(err.status || 500)
  res.send(err.message)
  })

app.listen(PORT)

Support for this server is being added to DDE via the Messaging class.

Testing shows messages getting through about every 50 ms, with latency from 80 ms (which is amazing) up to 300 ms under normal use.

All code archived here: https://drive.google.com/drive/u/0/folders/18OgYsn8LLy1IkCCo-7XHSYZBMB5R-nhN

You can see the result here: https://www.youtube.com/watch?v=xfD2V_AaFQY (This is without latency compensation of any type. We got 60ms round trips regularly, with as much as 500ms delays from time to time. The major issue was actually the compression and transmission of the return video. Reducing its resolution and framerate greatly improved teleoperation.)

NOTE: THIS IS VERY LIKELY TO CHANGE!

TODO: Fix issues with session cookies lasting forever

TODO: Add a means of registering new users and ensuring they are Dexter users.

FAIL: Try UDP as a faster option, when the router allows datagrams to NAT back. Although it looked like this was possible, it appears it is NOT on the standard app-engine. They don't explicitly say you can't for node.js (there is no page about it for node) but they DO say you can not accept inbound connections on the python page, and can no longer do outbound. If we want to do UDP, we need a REAL server with control of the firewall.

TODO: Transition to Flexible environment or a real server and re-enable websockets.

JamesNewton commented 4 years ago

Mosh / SSH -R

SSH has some interesting command line options.


     -T      Disable pseudo-terminal allocation.

     -R [bind_address:]port:host:hostport
     -R [bind_address:]port:local_socket
     -R remote_socket:host:hostport
     -R remote_socket:local_socket
             Specifies that connections to the given TCP port or Unix socket
             on the remote (server) host are to be forwarded to the given host
             and port, or Unix socket, on the local side.  This works by
             allocating a socket to listen to either a TCP port or to a Unix
             socket on the remote side.  Whenever a connection is made to this
             port or Unix socket, the connection is forwarded over the secure
             channel, and a connection is made to either host port hostport,
             or local_socket, from the local machine.

             Port forwardings can also be specified in the configuration file.
             Privileged ports can be forwarded only when logging in as root on
             the remote machine.  IPv6 addresses can be specified by enclosing
             the address in square brackets.

             By default, TCP listening sockets on the server will be bound to
             the loopback interface only.  This may be overridden by
             specifying a bind_address.  An empty bind_address, or the
             address ‘*’, indicates that the remote socket should listen on
             all interfaces.  Specifying a remote bind_address will only
             succeed if the server's GatewayPorts option is enabled (see
             sshd_config(5)).

             If the port argument is ‘0’, the listen port will be dynamically
             allocated on the server and reported to the client at run time.
             When used together with -O forward the allocated port will be
             printed to the standard output.

     -L [bind_address:]port:host:hostport
     -L [bind_address:]port:remote_socket
     -L local_socket:host:hostport
     -L local_socket:remote_socket
             Specifies that connections to the given TCP port or Unix socket
             on the local (client) host are to be forwarded to the given host
             and port, or Unix socket, on the remote side.  This works by
             allocating a socket to listen to either a TCP port on the local
             side, optionally bound to the specified bind_address, or to a
             Unix socket.  Whenever a connection is made to the local port or
             socket, the connection is forwarded over the secure channel, and
             a connection is made to either host port hostport, or the Unix
             socket remote_socket, from the remote machine.

             Port forwardings can also be specified in the configuration file.
             Only the superuser can forward privileged ports.  IPv6 addresses
             can be specified by enclosing the address in square brackets.

             By default, the local port is bound in accordance with the
             GatewayPorts setting.  However, an explicit bind_address may be
             used to bind the connection to a specific address.  The
             bind_address of “localhost” indicates that the listening port be
             bound for local use only, while an empty address or ‘*’ indicates
             that the port should be available from all interfaces.

So you can SSH in from A to B, but send messages back from B to A. And you can restrict the resulting login to use only certain ports, and not support running commands.
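
As a concrete illustration, the arguments for such a reverse tunnel can be assembled in Node before handing them to ssh. The user, server name, and port numbers below are placeholders, and the function name is hypothetical; `-N` (run no remote command) is a standard ssh option alongside the `-T` and `-R` options quoted above.

```javascript
// Build the ssh argument list for a reverse tunnel: connections to
// remotePort on the server are forwarded back to localHost:localPort here.
function reverseTunnelArgs({ user, server, remotePort, localPort, localHost = 'localhost' }) {
  return [
    '-N', // do not run a remote command; forward ports only
    '-T', // disable pseudo-terminal allocation
    '-R', `${remotePort}:${localHost}:${localPort}`,
    `${user}@${server}`,
  ];
}
```

The result could be passed to `require('child_process').spawn('ssh', args)` on the robot, so the robot initiates the connection outward and no inbound firewall hole is needed on its side.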

https://mosh.org/ uses a system like that to connect.

This works over UDP, so it may not always work. Although, SSH generally works, so the cases where it fails are probably quite rare.

And it still needs a server, because the connection must be initiated from the clients in all cases. And the server must be bare metal. And the server firewall must allow incoming UDP.

JamesNewton commented 3 years ago

Kamino cloned this issue to HaddingtonDynamics/OCADO