Closed joshuasing closed 4 months ago
Additionally, it appears that IP addresses are being stored alongside email addresses. What is the purpose of doing this? This is creating a link between two pieces of PII which could be abused by a potential attacker if the database were to be compromised.
I think at some point we wanted to have that information for the future, for marketing purposes, and we needed a place where to save and cross that information. Originally we would save it in Hubspot, then I think we switched to the database and here we are. I think @rmilrad may have more details about it as he's the one who laid down the original requests
As of the IPs, yes, we can hash them. There's also this related issue https://github.com/BVM-priv/ui-monorepo/issues/183
Yes, agree we should be hashing IPs in order to protect user data. @joshuasing @gndelia how are we currently checking email addresses for repeat Capsule abuse? Are we querying from Hubspot or are the emails also stored in the database? (cc @jcvernaleo)
@rmilrad Emails are stored in the db, together with IPs (For these marketing requests). There's also an "ip accesses" table in which an IP is stored for 7 days, and then removed, to allow using an IP at most once every 7 days, as discussed. Once an email is stored, it's kept for the time being (We may solve when to clear that in #183)
We're storing this in a db because the idea wasn't to query Hubspot
Yeah, I can see some issues arising from storing email in such a manner. I was under the impression that the emails were stored in Hubspot.
@jcvernaleo @joshuasing any thoughts here?
(cc @matt-lam)
It appears that
claim-tokens
currently stores IP addresses as plaintext in the database. I believe that doing this is unnecessary and may lead to compliance issues in the future.It would be safer to hash the IP addresses before storing them in the database, thus making it hard to get the original IP address, whilst making it still possible to do comparisons and prevent abuse.
Using a HMAC, such as HMAC SHA-256 would also help to prevent attackers from using rainbow tables to reverse the data. HMACs require a secret key, which must be stored securely and separately from the data.
A very basic example implementation of this would be:
This
hmac
function could then be called with a secret key (likely provided via an environment variable) and the IP address or any other sensitive data. A base64-encoded string will be returned, which can then be safely stored in the database.Additionally, it appears that IP addresses are being stored alongside email addresses. What is the purpose of doing this? This is creating a link between two pieces of PII which could be abused by a potential attacker if the database were to be compromised.