peterbrittain closed this issue 11 years ago
I agree on some points and disagree on others.
About HTTPS, I disagree: I believe HTTPS is obligatory. I really can't be more emphatic about this. The performance hit is just not a problem anymore. In particular, we should be able to avoid problems caused by lots of SSL handshakes because Requests uses connection pooling.
We don't need to HTTPS everything: restricting ourselves to HTTPSing the Web UI and initial Pi registration is potentially enough, we can use session tokens after that point.
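A minimal sketch of the session-token idea, assuming the token is minted over HTTPS at registration and then presented on later plain-HTTP calls (`issue_token`/`verify_token` and the `pi_id` scheme are hypothetical names, not anything in our code yet). Note the obvious caveat: a token sent over plain HTTP can still be sniffed and replayed, so this only protects the credentials themselves.

```python
import hashlib
import hmac
import secrets

# Kept only on the server; regenerating it invalidates all outstanding tokens.
SERVER_SECRET = secrets.token_bytes(32)

def issue_token(pi_id: str) -> str:
    """Mint an HMAC-signed token for a Pi during HTTPS registration."""
    sig = hmac.new(SERVER_SECRET, pi_id.encode(), hashlib.sha256).hexdigest()
    return f"{pi_id}:{sig}"

def verify_token(token: str) -> bool:
    """Check a presented token on subsequent (non-HTTPS) API calls."""
    pi_id, _, sig = token.partition(":")
    expected = hmac.new(SERVER_SECRET, pi_id.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information via timing differences.
    return hmac.compare_digest(sig, expected)
```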
Some other thoughts:
Fair enough. If our server can cope with the SSL, I'd be much happier using it. For some reason, though, I thought we might be running the server on a Pi and so might have an issue here. For now, let's assume we use SSL until it becomes an issue.
We might be able to do something more cunning about a DoS attack, but for now I'll assume that it's a matter of getting another server up and running as quickly as possible. Let's get the basics sorted before adding bells and whistles like white-listing...
I've just got my first Pi installed and tried installing our server code on it. The good news is that it all just worked out of the box (the wiki has been updated to explain how to do it). The bad news is that I completely blew all the CPU on the Pi trying to run the server with just my simple test script and no SSL.
Looks like we'll either need to back off the SSL everywhere idea and be more selective on when we use it, or ditch the original idea of running the server on RPis too.
Yeah, my strong view on using SSL was predicated on the idea that we'd use a non-Pi server. If we run the server on a Pi instead, SSL is way less likely to work.
Actually, as a further thought, using a Raspberry Pi as the server is almost certainly going to go poorly. I don't think we can expect one Raspberry Pi to take something in the region of 300-700 times the load of each of the spoke Pis: at least, not if we're writing the server in Python.
If we were really determined to use Raspberry Pis everywhere we could use a cluster of Pis as a 'server', which is an interesting engineering challenge, but not necessarily a good decision.
I think Neil's idea was that you could use the server as a local hub and so it would be servicing far fewer requests. Based on what I'm seeing, I'm not convinced it would be capable of handling even 10 clients in that case (even without SSL).
That said, we have yet to set up a production server, so use of Apache + postgresql (or favoured Pi equivalents) might make the difference for such small scale deployments.
Fortunately, we have already proved that we can run the server without any issue on pretty much any Windows or Linux PC, so maybe the teacher installs that on their PC?
Wait, were you using the Django builtin server for this testing?
Yup - and I knew it would be slower as a result... But this was REALLY slow.
With effort, we may be able to get it working for very small deployments, but it is clearly not going to work in all cases. Hence why I've re-opened this issue.
I think the resolution to this thread is to set up a proper deployment server and then see how well it copes with real requests. Volunteers anyone?
Django's development server isn't just slow, it's single-threaded. That probably accounts for part of your problem. I've got some experience setting up Django with Gunicorn, so I can give that a shot.
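For reference, a Gunicorn setup along these lines would look roughly as follows (the project name `ourproject` is a placeholder, not our actual module name):

```shell
# Install Gunicorn into the project's environment
pip install gunicorn

# Serve the Django WSGI app with several worker processes. Unlike the
# dev server (manage.py runserver), this handles requests concurrently.
gunicorn ourproject.wsgi:application --bind 0.0.0.0:8000 --workers 3
```

On a single-core Pi the worker count probably wants to stay small; more workers just means more memory pressure.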
My client was single-threaded too - it just issued one request after the other and spent up to ~10-20 seconds waiting for some to return before issuing the next request. The whole time the request was being processed, the Pi was running at 100% CPU.
So I've put Gunicorn in front of the Django test app and run our test script against it from my Windows box. I also updated the test script to print out the time each request took (change checked in). Same LAN, so almost all the request time here is on the Pi. Highlights are:
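The timing change to the test script is roughly this sketch (the real script wraps calls from the Requests library; `timed` here is an illustrative helper, not the checked-in name):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn, print how long it took, and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{fn.__name__} took {elapsed:.3f}s")
    return result, elapsed

# In the real test script this wraps the HTTP calls, e.g.:
#   response, secs = timed(requests.post, url, data=payload)
```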
The cost here is almost certainly down to the SD card. Disk access is always going to be a pain, and the SD card is JUST SO SLOW. I might try putting a networked database behind the app to see if it goes faster.
Could be that sqlite is not caching requests to access its DB file... It's certainly worth setting up a real DB server at some point.
However, I don't see the benefit of running it on a separate server from the Pi, because that effectively means you have to set up that separate server for the DB. And if we're going to do that, we may as well insist that the web server goes there too.
Yeah, I'm inclined to blame sqlite. At @lwr20's suggestion I moved the db file into a ramdisk on the Pi. This removed some of the variance in request times, and dropped the time of anything reading the DB to basically zero. Registration still took 7 seconds, re-registration took 3, and de-registration took 3. Everything else took less than a second.
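For anyone wanting to reproduce the ramdisk experiment, the rough steps are below (paths and sizes are illustrative; anything in the ramdisk is lost on reboot, so this is purely a diagnostic, not a deployment option):

```shell
# Mount a small tmpfs and copy the sqlite file into it
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=32m tmpfs /mnt/ramdisk
cp db.sqlite3 /mnt/ramdisk/db.sqlite3

# Then point Django at it in settings.py:
#   DATABASES['default']['NAME'] = '/mnt/ramdisk/db.sqlite3'
```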
At this stage it looks like sqlite is to blame here, though I'm not used to it being that slow.
Oh, maybe not. Watching the output of 'top' the gunicorn process was maxed out. We might need to profile the django code to see what's going on.
The one real difference between user registration and the rest of the API is that it authenticates the user. I bet that's using a monstrously CPU-intensive password-hashing algorithm, and so we need to pick a less intensive one.
So with some testing done here that appears to be correct. I've switched the Django config to prefer SHA1 which vastly improves our timings.
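For the record, the change was along these lines in `settings.py`. Django's default PBKDF2 hasher deliberately runs many thousands of iterations per password check, which is what was eating the Pi's CPU; the SHA1 hasher is nearly free but is also cryptographically much weaker, so this is a conscious trade of password-storage security for speed:

```python
# settings.py: prefer the cheap (but weak) SHA1 hasher on the Pi.
# PBKDF2 stays in the list so existing hashes can still be verified.
PASSWORD_HASHERS = [
    'django.contrib.auth.hashers.SHA1PasswordHasher',
    'django.contrib.auth.hashers.PBKDF2PasswordHasher',
]
```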
OK - let's close this one down again now. We can run a very small network on a Pi if we have to, but expect the public network to use a cloud server for the required time.
I was having a play at setting up the central server last night and have created a simple Django app that:
So far, so good: we can have administrators of the service (staff) and normal users, each with their own dedicated interface (HTML/GUI or JSON).
The problem I'm hitting is the question of what is secure enough for something that we propose to allow schools to use and yet will be running "in the wild" on the Internet.
Some simple questions we need to consider:
My gut reaction is that we can't make a super secure service without a lot of effort and so probably should accept some limitations. It's probably enough to make this system resilient to mild scrutiny. In particular, I'd propose:
Make sense?