Test for SSL connection/certificate and handle otherwise

fnoop commented 4 years ago

There are various connections made to a maverick computer:

http/s for the website, on standard ports 80 and 443
maverick-discovery, currently non-ssl websockets on port 1234 but should be ssl
maverick-api, standard system instance on http and websockets on ssl port 6700
webrtc (janus) websocket transport on port 6796

If the main website is loaded over http, without loading the self-cert CA into the os/browser, then the various encrypted ports are not available. ssl is required for webrtc but not currently any of the other services.

If the main website is loaded over https and the CA cert is loaded correctly, then the unencrypted ports will not work.

The solutions are:

Run everything unencrypted by default, and take the user through the CA cert loading process for webrtc, or show a warning.
Run everything through ssl, and take the user through the CA cert loading process.
Run everything through ssl, and fix the certs so they can be accepted with a warning (currently the certs don't allow that).

All of these should be tested.

fnoop commented 4 years ago

If the CA cert isn't loaded then the browser can't connect with a Net::ERR_CERT_INVALID.

Ideally we would trap this and take the user through the loading process, but seems that's not possible directly: https://stackoverflow.com/questions/28556398/how-to-catch-neterr-connection-refused

fnoop commented 4 years ago

https://stackoverflow.com/questions/2618944/check-in-javascript-if-a-ssl-certificate-is-valid/25077676#25077676 https://stackoverflow.com/questions/45067892/xmlhttprequest-onerror-handler-use-case https://javascript.info/websocket

fnoop commented 4 years ago

@SamuelDudley - It turns out to be pretty much impossible to detect the state of SSL in the browser, or to detect a bad ssl cert etc. The best we can do is workarounds like detailed above, by firing image or http requests and responding to an error event. However that error event could simply be that the service involved isn't running, or there's a dns/mdns/zeroconf problem and you can't reach it by hostname.

I think the sensible thing to do in the short term, for an initial 1.2 release, is to:

move all fixed services (website, -discovery, -api@system, visiond, webrtc) to both encrypted and encrypted ports simultaneously, and default everything to unencrypted. That way we can get all of the basics working without pain to the user.
webrtc video itself won't work without ssl, so we'll have to present this to the user somehow and get them to upgrade, leading to..
if we need/want to point the browser any SSL service (eg. webrtc), then we have to move all services to ssl, as that is what browsers demand. So we should continue down the road of building detection and a nice interface that walks the user through installing the Maverick CA cert for that host, and move the entire thing to SSL.
Over time, once we're happy with the process of upgrading the browser to SSL, we should move to SSL by default.

Does that sound sensible plan?

SamuelDudley commented 4 years ago

Yep, sounds like a plan. As we are now (potentially) serving two of everything (one with SSL and one without) are there useful assumptions we can bake into -web, and in particular -discovery? e.g.

location.protocol==='http:' -web should talk to ws -discovery by default and only expose non-ssl services (and ideally have the video tab disabled)
location.protocol==='https:' -web should talk to wss -discovery and expose only ssl services

using the https interface makes the assumption that SSL is configured correctly.. perhaps we can serve the https content from a separate endpoint of nginx for now say /secure/ and keep the http version of the site at /?

SamuelDudley commented 4 years ago

we can then force the content at /secure/ from http to https via nginx, likewise we can force http at / .

My reasoning behind this is that we want to make it very clear there is a distinct difference between what is happening in the background of the two services. If a user connects to one or the other they have no choice how they are doing it. Hopefully this will make the whole experience more predictable for the user and us (if we need to help them).

fnoop commented 4 years ago

Yep, sounds like a plan. As we are now (potentially) serving two of everything (one with SSL and one without) are there useful assumptions we can bake into -web, and in particular -discovery?

I think not only should we, we have to as if you're connected over https then you have to do all websocket connections over ssl no matter the target. This is complicated however by -web being able to connect to multiple -discovery, -api and webrtc instances across multiple systems. Just because you can connect to the website over https doesn't meant that you can connect to the other systems over https and therefore wss. So the interface will need to trap and handle all ssl connection problems to whatever -api or other connections have been defined, and walk the user through setting up ssl for each system.

fnoop commented 4 years ago

Do we handle SSL through tornado multi-binding discrete sockets for ssl/nonssl, or do we only do nonssl in tornado and rely on nginx as an ssl proxy terminator?

I'm probably in favour of tornado - it's more controllable and less reliant on yet another major component in the system (nginx)?

SamuelDudley commented 4 years ago

But there will be both a ws and wss discovery, each on a known port. https -web will know what port to look for and we can add a ssl=True/False flag to each of the zeroconf services. This will allow discovery over ws/wss to filter available services accordingly. This means only ssl services are exposed via wss discovery to https web. Sure, it's then possible to manually add an end point that is incorrect, but it is going to be a deliberate action.

I think handle https web and http web with nginx and spawn additional services with seperate ssl vs non ssl configurations for api, discovery, etc.

SamuelDudley commented 4 years ago

I'm not sure if you were suggesting this but I'm not particularly keen on the idea of having both a ssl enabled server and a non ssl server running from the same python process (e.g. two api servers running within the same python program). To me it feel like it's something that should be managed at a discreet systemd service level.

fnoop commented 4 years ago

I'm not sure if you were suggesting this but I'm not particularly keen on the idea of having both a ssl enabled server and a non ssl server running from the same python process (e.g. two api servers running within the same python program). To me it feel like it's something that should be managed at a discreet systemd service level.

Yes that's what I had in mind - what's wrong with this?

I'd like to avoid a sprawling mess of services and instances which is what this is quickly turning into! In my mind simplest is always best. I think at least at the start we have to run discrete ssl and non-ssl ports for each service all the time. If we don't run non-ssl then normal new users won't be able to connect to anything without jumping through the SSL hoop for each and every connection/system they're connecting to. And if they choose to do that for one system then they have to do it for all of them.

To make it simplest and easiest for the user out of the box, I would have thought that we setup:

SSL and Non-SSL ports for each service
-discovery responds to both ssl and non-ssl, always
each service broadcasts their ssl and non-ssl ports through zeroconf, which -discovery picks up and forwards through websockets to -web
-web uses non-ssl by default for all connections, so they just work automagically out of the box. when they're ready to change to ssl, they already have the ssl endpoints setup (previously braodcast through -discovery) and - web can talk them through the process of upgrading to ssl.

This is a hard problem and will be tricky to solve, but I feel that we should do it led by the experience of the naive user?

SamuelDudley commented 4 years ago

Agreed this is a tricky one... The mess will still be there, its just a layer deeper :P. Separate ports will still be required and common things like logging and database access will be shared by default and all settings that specify ports such as mavlink connections will need to be duplicated adding to additional application maintenance + complexity for what I'm hoping will not be a long term problem. IMO adding / removing a service is much cleaner way of splitting the two and minimizes long term ramifications. I also really don't like the idea of discovery knowingly adding services to -web that will not work given how the page is being loaded (e.g. ssl services being added to http web, the user will try to activate them... The simplest solution for the problem I can come up with is have one set of services that are all non-ssl by default (as per your suggestion) and have a config switch in localconf.json that makes all the required services (currently only -api and -discovery) move to https.

SamuelDudley commented 4 years ago

I assume one of your concerns is that if a user has more than one vehicle in a fleet, by moving to https on one they then will need to move to https on all of them. Your suggestion above does get around this requirement but I would suggest at a pretty sizable development cost.

fnoop commented 4 years ago

OK to having separate services for ssl/non-ssl - what you say makes sense technically.

The simplest solution for the problem I can come up with is have one set of services that are all non-ssl by default (as per your suggestion) and have a config switch in localconf.json that makes all the required services (currently only -api and -discovery) move to https.

I don't think that any interaction/configuration by the end-user is a good thing. Whatever we come up with will be a deeply technical system/process that will have to be well tested, and the easiest way to do that is by limiting the amount of choices and permutations to the absolute minimum.

I also really don't like the idea of discovery knowingly adding services to -web that will not work given how the page is being loaded (e.g. ssl services being added to http web, the user will try to activate them...

But I think -web logic will make these decisions, not the user. What drives all of this is whether the user chooses to make the main website http or https - that drives all further decisions (technically) as to which ssl/non-ssl endpoints are used. So if the user decides that he/she/it wants to use a video stream, by definition he/she/it has to then load the main website as https. By doing this they are mandating that all defined connections - -discovery, -api and webrtc must all immediately be switched to ssl. The -web interface should take care of all of this automatically from previously -discovered endpoints without any knowledge or intervention from the user, other than where absolutely necessary - ie. the certs.

Following that logic, the only way to me to make this as seamless to the user as possible is to create and activate ssl and non-ssl instances of all services from the beginning so they are always available, and broadcast them through -discovery, and have -web immediately ready to switch back and forth between ssl and non-ssl as necessary (eg after setting everything up for ssl and using ssl services, the user might fire up their browser the next morning and load -web through http).

Then it's up to us in -web to detect and guide the user through this minefield as clearly and simply as possible, which shouldn't be too much work. Does that make sense to you?

It's either that, or from the start we just to SSL only for all services except for -web (we need http for -web so we can guide the user through setting up ssl). That does simplify the architecture an awful lot, at the expense of a little more initial difficulty to the user.

SamuelDudley commented 4 years ago

OK to having separate services for ssl/non-ssl - what you say makes sense technically.

-api is going to be the biggest change... let me experiment with what will need to be modified before we commit one way or the other. The abstraction required for the logging, etc might no be too bad / too much work.

I don't think that any interaction/configuration by the end-user is a good thing. Whatever we come up with will be a deeply technical system/process that will have to be well tested, and the easiest way to do that is by limiting the amount of choices and permutations to the absolute minimum.

Okay, I agree with that point... Ultimately we have control of the configuration so we should use that to our advantage.

activate ssl and non-ssl instances of all services

Just so we are clear do you agree that currently only -discovery and -api need to support this dual mode of operation? visiond talks to webrtc via ws and that wont need to change and wrbrtc is always ssl only, correct?

have -web immediately ready to switch back and forth between ssl and non-ssl as necessary

As long as we treat the services in -web as ssl/non-ssl pairs and changes to user defined descriptions, names, etc are reflected but hidden from the user until -web is loaded over http/https I am okay with that plan.

It's either that, or from the start we just to SSL only for all services except for -web

I think we have agreed in the past that it will be too much of a barrier for interested but not highly committed users. Once we can offer more features in -web that don't technically require ssl, such as setup, planning, cockpit and analysis the lower barrier to entry will be highly useful IMO. Currently it seems a bit one sided for ssl because its needed for webrtc which makes up 90% of the current -web feature base!

As an aside, I will personally want a ssl only mode, which is something that I will advocate for in the future. Such a mode would enable nginx to redirect to https always and switch off all the http / ws services running on the system... but that can come in the future.

cglusky commented 4 years ago

Is it possible to ship a default set of certs and just make everything https/wss. Just publish the cert and keys and whatever required. Cant be worse than http. Then just doc and warn that certs need to be replaced asap. If you are installing without internet your surface area for attack is the LAN right? And someone with their password on a yellow sticky is more likely than someone sniffing and mitm for a lan without internet?

fnoop commented 4 years ago

Is it possible to ship a default set of certs and just make everything https/wss. Just publish the cert and keys and whatever required. Cant be worse than http. Then just doc and warn that certs need to be replaced asap. If you are installing without internet your surface area for attack is the LAN right? And someone with their password on a yellow sticky is more likely than someone sniffing and mitm for a lan without internet?

Not possible, unfortunately. You can't generate .local certs that are automatically trusted by browsers.

cglusky commented 4 years ago

Edit. Sorry. Likely not important.

fnoop commented 4 years ago

And visiond no longer works without webrtc which is the only thing that will not work without ssl? Or are there more services?

No it's a very important question. visiond still works because that's separate to -web - it's about streaming video. It's setup to serve rtsp by default, so you can load it in VLC or in your favourite GCS. We also use it in -web through webrtc, and webrtc mandates SSL. But it's not the only thing - for example PWA also mandates SSL as do most of the other new funky web technologies like http2, geolocation, and service workers. Going forward we'll certainly want to use most of these, and pwa/service workers probably in the near term as soon as we start adding GCS components.

fnoop commented 4 years ago

One option is to tell the client to set browser option: chrome://flags/#unsafely-treat-insecure-origin-as-secure

This should work in any chrome-backed browser.

SamuelDudley commented 4 years ago

Re trying to access a https tornado service via http. We could define a custom error code and have it detected in -web?

SamuelDudley commented 4 years ago

I was also thinking if we could create a https to http proxy in tornado and direct https traffic to the http server locally (to possibly get around running two complete server instances)

(Full of bad ideas this evening!)

fnoop commented 4 years ago

-api now broadcasts full set of ssl endpoints over zeroconf. -discovery picks these up and adds to internal cache. -web tests to see if it's loaded over https (and thus has a cert loaded), if so it connects to -discovery over wss, and for each -api client found in broadcast creates https/wss connections to the -api instance. If -web isn't fully configured for wss but loaded over https, then doesn't create the -api client and outputs error.

fnoop commented 4 years ago

Loading -web over either http or https works well now - it discovers and connects to -api over http/https/ws/wss as necessary.

Further actions:

Detect if -web loaded over http and if https can be loaded cleanly (ie cert installed). If not, display a warning and guide the user through installing the cert.
Detect if remote -api instances defined are available over ssl, and if not then guide the user through installing the cert for that system.

fnoop commented 4 years ago

Use a one-pixel transparent element-less (DOM) image load in the background to test for SSL state. If it doesn't check out, then don't carry on with the client if -web loaded over https. Seems to work well, probably as good as we're going to get given that it's deliberately impossible to properly test SSL states through the browser.

Now just need to add a notification and popup interface to guide the user through downloading and installing missing certs.

SamuelDudley commented 4 years ago

It's crazy that is the best method we have for testing SSL! I guess we continue to operate on the edge of normality with this project so we will continue to hit these pain points... Looks like a nice fix :)

fnoop commented 4 years ago

Using https://www.npmjs.com/package/vue-browser-detect-plugin to detect OS/browser and display an SSL CA cert install guide to the user, for each API that is detected as not having working SSL. Firefox uses the simplest process across all browsers, and is probably the browser to recommend to users (as well as having consistent interface across widest choice of OS, fully opensource etc).

Chrome and Safari on MacOS both use the system Keychain app, which is a lot more complicated to install a cert into.

Linux and Windows need guides created.

fnoop commented 4 years ago

Homepage ssl test/upgrade added, video component checks for ssl. I think we're mostly done now except for filling in the instructions for different OS/browser combinations.

fnoop commented 4 years ago

goodrobots / maverick-web

Test for SSL connection/certificate and handle otherwise #173