zacbraddy / scout-waiting-list

An app to administer the waiting list of a local scout group
MIT License

ScoutID is a specific value from our Spreadsheet #13

Open thisisthetechie opened 4 years ago

thisisthetechie commented 4 years ago

I haven't tried any dev on this until just now, obviously.

ScoutID is an MD5 hash of the Scout's name, generated in our Spreadsheet.

This will be sent out to the parent/guardian when they join the list, so this needs to be the same on the waiting list.

zacbraddy commented 4 years ago

Before I get down to an answer, I've edited your issue to remove the image of my son's name. Please remember this is a public website and a public repo, so we do have to be careful with sensitive information.

I can certainly make a change to allow you to edit the ID. However, this would be another plain text field that you'd have to fill in, rather than being generated as the current IDs are. The only question would be: do you want to keep the current IDs visible, or would you prefer I hide the current ID and have the ID that you're planning on filling in yourself be the only one visible?

There is also the question of how you want collisions of those IDs handled. Are you expecting some validation to ensure you can't use the same ID for different people? Or are you happy to maintain that yourself without the app having to check it?

Final point: I'd be careful using an MD5 hash based on GDPR-sensitive material, as MD5 is hideously insecure. I know you aren't sending these things out to super nefarious or even totally computer-literate people, but if the plan is to put these things on the internet then you are opening those hashes up to the public. Unless my understanding is incorrect, someone who was interested in the data could then just look for a collision to attempt to reverse the encryption.

One benefit of sticking with the IDs generated by Firebase that you currently have is that they are based on absolutely nothing that could be traced back to the kids at all.

thisisthetechie commented 4 years ago

Sorry about that, I'm used to working in private repos.

Part of the plan is to have the parent/guardian automatically emailed their ID when the ID is generated. That's not possible if it's being generated on your side (we'd have to import the key back into the spreadsheet).

That said, you're right about MD5 - so it's now a secure key subset. We have some visual information in place if the ID is not unique, the chances of collision are low I think, but we can certainly identify them prior to writing to your app.

If you can hide your ID and show ours, that would be cool.

zacbraddy commented 4 years ago

I was talking more about collision caused by the user (you or Nige) accidentally typing in the same one twice: validation for honest mistakes rather than an attack vector on the app. The chances of an MD5 collision just from generating them are pretty low, we can agree on that.
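As an illustration of the "honest mistake" validation described above, here is a minimal Python sketch. It is not the app's code: the function name `find_duplicate_ids`, the `(record_key, manual_id)` pair shape, and the sample values are all assumptions made up for the example.

```python
from collections import defaultdict


def find_duplicate_ids(entries):
    """Return {manual_id: [record_keys]} for every manual ID used more than once.

    `entries` is an iterable of (record_key, manual_id) pairs. Both the shape
    and the names are hypothetical; the real app may store these differently.
    """
    by_id = defaultdict(list)
    for record_key, manual_id in entries:
        cleaned = manual_id.strip()  # ignore stray spaces from manual typing
        if cleaned:  # blank IDs (not filled in yet) are not treated as clashes
            by_id[cleaned].append(record_key)
    return {mid: keys for mid, keys in by_id.items() if len(keys) > 1}


# Made-up sample data: rec1 and rec3 were accidentally given the same ID.
sample = [
    ("rec1", "loUd0mzQ"),
    ("rec2", "abc12345"),
    ("rec3", " loUd0mzQ "),
    ("rec4", ""),
]
duplicates = find_duplicate_ids(sample)
```

The comparison is deliberately case-sensitive, on the assumption that the manually entered IDs are case-sensitive strings.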

But just so we're clear, collision of the keys is not specifically the reason I'm saying that publishing MD5s to the web is a bad idea. I'm not worried about the app breaking. What I'm worried about is that if you can find a collision then in theory you could use that to brute force out the kid's name.

With MD5s being so easily crackable, to the right people, you might as well be publishing the kids' names directly to the internet. That style of encryption is going to do basically nothing to protect them if the MD5s are in the hands of the wrong people. This is because if you know that those MD5s are based on names, you have a significantly smaller dictionary to work with when brute forcing the name.

You see, I'm assuming that previously, sending out these MD5s meant sending them privately to parents, who then checked them against a link-shared Google spreadsheet. For those purposes MD5s are fine because they aren't being published publicly to the internet. If I add a column where you fill these in then obviously they will start getting published publicly. So I think investing in finding some way for us to sync these IDs up is in our best interests and the interests of the kids and parents.

So I guess the direct answer is that I can add a column where you can enter the MD5s and they'll be displayed, but I think it's not a super great idea.

If you were SUPER married to keeping the MD5s and there was no other option, then we could make it so that we don't have an open front end. I could make it so that the parents would have to log in to see the public-facing list. This would mean that the audience for the list would be reduced to just the parents whose kids were on the list rather than the wider internet. That wouldn't make the problem go away completely, as you still might get a MR.ROBOT parent, but at least it's not the whole internet you have to worry about.
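To make the small-dictionary point concrete, here is a minimal Python sketch. It assumes the published ID is the plain hex MD5 of the name with no salt or key, which the thread does not spell out. The helper names `md5_hex` and `guess_name` are hypothetical, and the candidate names are invented placeholders.

```python
import hashlib


def md5_hex(name: str) -> str:
    """Hex MD5 of a name (assumed here to be how the ID was derived)."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def guess_name(target_hash: str, candidate_names):
    """Hash each candidate name and return the one that matches, if any."""
    for name in candidate_names:
        if md5_hex(name) == target_hash:
            return name
    return None


# Invented placeholder names standing in for "a list of plausible names".
candidates = ["Alex Example", "Sam Sample", "Jo Placeholder"]

# A hash as it might appear on a public page.
published = md5_hex("Sam Sample")
recovered = guess_name(published, candidates)
```

Nothing here depends on a weakness specific to MD5: any fast, unkeyed hash of a guessable input can be matched this way, which is why the size of the dictionary matters.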

thisisthetechie commented 4 years ago

Like I say, I'm not using MD5 any more, I've got a subset of a signed hash.

But MD5 or not, I'm not using the whole thing, just 8 characters from the entire string. This means (as I understand it) that a hacker would need to know which 8 characters I've used and then what the stripped values are that were dropped on the floor when the ID was generated. It's quite literally a random string that is generated based on a set of rules so is repeatable on creation.

So, for instance, I'm generating a signature (or MD5 hash) that could be "lY7oNZyTloUd0mzQYQVrFULfbb2nDDzXe" and capturing X consecutive characters within that, such as "loUd0mzQY". To reverse the ID into useable data, you would need to somehow be able to generate the characters stripped before and after the published ID and then execute your algorithm of choice.

The reason for the hash was simply to generate an ID. It's not being reversed anywhere. And now I'm using a SHA cert to generate my signature, which I'm then extracting X from within the result to provide my ID.

Does that make more sense?
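A rough Python sketch of the scheme described above, under stated assumptions: the thread does not say how the signature is produced, so this uses HMAC-SHA256 with a secret key, encoded as URL-safe base64. The function name `scout_id`, the key value, and the `start`/`length` defaults are all hypothetical.

```python
import base64
import hashlib
import hmac


def scout_id(name: str, secret: bytes, start: int = 8, length: int = 8) -> str:
    """Derive a short, repeatable ID from a keyed signature of the name.

    Assumption: HMAC-SHA256 stands in for the "signed hash" in the thread.
    """
    digest = hmac.new(secret, name.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.urlsafe_b64encode(digest).decode("ascii")
    # Publish only a consecutive slice of the full signature.
    return signature[start:start + length]


# Placeholder key and name, for illustration only.
example_key = b"replace-with-a-real-secret"
first = scout_id("Sam Sample", example_key)
second = scout_id("Sam Sample", example_key)
```

In a construction like this, it is mainly the secret key that stops someone from recomputing IDs for guessed names; keeping the slice position private adds little once the key is known. The shorter the slice, the more likely two names are to share an ID, so a uniqueness check before publishing still applies.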

zacbraddy commented 4 years ago

Ahh righto, yeah I think that puts my mind somewhat at ease. Ok, well that's fine, yeah I can make a column to do that. Not sure how quickly that will go through, but in the meantime you can just let the parents know the first 5 characters of the current ID, as they will be unique enough, and you will be able to work this out just by using the human-readable names in the admin side.

But yeah I can get working towards this when I next get a chance to work on the app.
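Whether the first 5 characters really are unique can be checked before telling parents. Here is a small Python sketch; the function name `clashing_prefixes` and the sample IDs are invented for the example and are not real Firebase keys.

```python
from collections import Counter


def clashing_prefixes(ids, length: int = 5):
    """Return the set of prefixes of `length` shared by more than one ID."""
    counts = Counter(full_id[:length] for full_id in ids)
    return {prefix for prefix, n in counts.items() if n > 1}


# Made-up IDs: the first two share their first 5 characters.
sample_ids = ["-Mabc1xyz", "-Mabc1qrs", "-Mzzz9kkk"]
clashes = clashing_prefixes(sample_ids)
```

An empty result means every ID can safely be shortened to that prefix length.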

zacbraddy commented 4 years ago

@NOwen1stNut and @thisisthetechie check out the dev environment now and you should be able to play with the manually inputted Id that you've requested in this issue. Let me know if it suits and if it does then I'll push it to prod.

Just be mindful that until you've gone through and added all the IDs for the kids, the front end will show blanks, as we currently don't have that data. So when we go to prod with this change, you'll have to jump on putting the IDs in to avoid the site looking a bit weird.