Closed by chadwhitacre 8 years ago
I think we should host our vault directly on AWS, since they clearly offer a PCI compliant environment, whereas Heroku doesn't advertise as much.
I'm envisioning a very simple key/value store: an expansion of vault.py that puts it on the network. I suppose the thing to do would be to use HTTP so we can post into it from JavaScript. We don't want to transmit sensitive data through the main web app at all.
Data Encryption
In addition to being able to store secrets, Vault can be used to encrypt/decrypt data that is stored elsewhere. The primary use of this is to allow applications to encrypt their data while still storing it in the primary data store.
The benefit of this is that developers do not need to worry about how to properly encrypt data. The responsibility of encryption is on Vault and the security team managing it, and developers just encrypt/decrypt data as needed.
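As a rough illustration of what "encrypt data that is stored elsewhere" could look like from our side, here is a minimal sketch that builds (but does not send) a request to Vault's transit encrypt endpoint. It assumes the transit backend is mounted at its default `transit/` path; the Vault address, token, and key name `gratipay-identity` are hypothetical placeholders, not anything we have set up.

```python
import base64
import json
import urllib.request


def build_transit_encrypt_request(vault_addr, token, key_name, plaintext):
    """Build (without sending) a POST to Vault's transit encrypt endpoint."""
    body = json.dumps({
        # The transit backend expects the plaintext to be base64-encoded.
        "plaintext": base64.b64encode(plaintext).decode("ascii"),
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr}/v1/transit/encrypt/{key_name}",
        data=body,
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values, for illustration only.
req = build_transit_encrypt_request(
    "https://vault.example.invalid:8200",
    "hypothetical-token",
    "gratipay-identity",
    b"123-45-6789",
)
```

Sending this request would return a ciphertext string that the app stores in its own database; Vault itself keeps only the key. Note that the app's token would still need a policy limiting it to this one path.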
One key feature of our requirements here is that the web app only needs to write secrets, not read them. It's the payroll process that needs to read secrets, in order to originate ACH credits and populate invoices. My thought is that we should use public key cryptography, with the web app holding the public key (via `heroku config:set`) and the payroll process having access to the private key.
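To make the write-only idea concrete, here is a minimal sketch using RSA-OAEP from the third-party `cryptography` package. The choice of RSA-OAEP, the 2048-bit key size, and the function names are all assumptions made for illustration; nothing in this thread settles on a particular scheme.

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# OAEP padding with SHA-256, shared by the encrypt and decrypt sides.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def generate_keypair():
    """Run once, offline: the private key stays with the payroll process."""
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_pem = private_key.public_key().public_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PublicFormat.SubjectPublicKeyInfo,
    )
    return private_key, public_pem


def encrypt_for_payroll(public_pem, secret):
    """Web app side: it holds only the public key, so it can write but not read."""
    public_key = serialization.load_pem_public_key(public_pem)
    return public_key.encrypt(secret, OAEP)


def decrypt_in_payroll(private_key, ciphertext):
    """Payroll side: the only place the private key is available."""
    return private_key.decrypt(ciphertext, OAEP)


private_key, public_pem = generate_keypair()
ciphertext = encrypt_for_payroll(public_pem, b"000123456789")
```

Here `public_pem` is what would go into the app's config, and `ciphertext` is what would be stored in the database. Plain RSA-OAEP with a 2048-bit key only fits short values (under 200 bytes), which is enough for an account number but not for arbitrary documents.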
Introducing a server component, whether Vault or something else (including something DIY), increases our surface area and level of complexity significantly more than integrating encryption-before-storage into our existing application architecture would. What are the PCI implications of the latter?
Another design requirement: I want separate access groups for the main web app and the PCI vault. I want to be able to grant access to Heroku (app + db) as we've been doing, which is carefully, to be sure ... but we need to be even more careful with access to vaulted data.
Let's distinguish the three pieces of information we're intending to collect, their risk profile, and our immediate application requirements regarding each.
piece of information | risk | write | read—process, role, purpose |
---|---|---|---|
bank account number (BAN) | financial theft | web | payroll, Gratipay, generation of NACHA files to submit for ACH origination |
individual national identification number (NIN) | personal identity theft | web | web, team owners, filling out tax forms |
business identification number (VAT/EIN) | business identity theft | web | web, supporters and team owners, generation of invoices |
So the web app does need to read some secrets.
Meaning it does come under the systems we need to consider in terms of PCI compliance.
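The table above can be restated as a small access matrix, which makes the conclusion easy to check mechanically. This is only a sketch: the process names `web` and `payroll` come from the table, while the `SecretPolicy` type and helper functions are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class SecretPolicy(NamedTuple):
    risk: str
    writers: frozenset
    readers: frozenset


# One entry per row of the table: which process writes and which reads.
POLICIES = {
    "BAN": SecretPolicy("financial theft", frozenset({"web"}), frozenset({"payroll"})),
    "NIN": SecretPolicy("personal identity theft", frozenset({"web"}), frozenset({"web"})),
    "VAT/EIN": SecretPolicy("business identity theft", frozenset({"web"}), frozenset({"web"})),
}


def may(process, action, kind):
    """Return True if `process` is allowed to perform `action` on secret `kind`."""
    if action not in ("read", "write"):
        raise ValueError(f"unknown action: {action!r}")
    policy = POLICIES[kind]
    allowed = policy.writers if action == "write" else policy.readers
    return process in allowed


def readable_by(process):
    """List the kinds of secret a process needs to read, sorted by name."""
    return sorted(kind for kind, policy in POLICIES.items() if process in policy.readers)
```

With this matrix, `readable_by("web")` returns the NIN and VAT/EIN entries, which is exactly why the web app cannot be treated as write-only.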
The requirement for invoices is that VAT be available to both supporters (buyers; https://github.com/gratipay/gratipay.com/issues/1199#issuecomment-24576143) and teams (sellers; https://github.com/gratipay/gratipay.com/issues/1199#issuecomment-67562705).
Hashi Vault supports dynamic secrets. Could we use that to ensure that access to Heroku doesn't entail access to our vault?
Dynamic Secrets: Vault can generate secrets on-demand for some systems, such as AWS or SQL databases. For example, when an application needs to access an S3 bucket, it asks Vault for credentials, and Vault will generate an AWS keypair with valid permissions on demand. After creating these dynamic secrets, Vault will also automatically revoke them after the lease is up.
Like, when the app spins up, it asks our vault for credentials to our vault?
Looks like that would take some work.
I'm going through the Vault intro.
Alright, I am introduced to Vault. It's a nice piece of software. We very well may be able to use it here.
I want to give people access to a web app (at Heroku, as it happens) that has access to Vault, without giving the people the same access to Vault as the web app has. This could be achieved with a vault secret backend that supported dynamic secrets, yes?
I've registered for an AWS account.
Can we use the browser as the go-between to avoid leaking vault access to people with Heroku access?
I don't see how to meet this requirement with Vault. :(
Or at all, really. If the web app has to be able to write, then whoever has access to the web app could potentially write out their bank account details and collect all of payroll for a week.
Okay, so let's take it that we don't have a separate access tier that is even tighter than access to our production hosting environment and database.
Then we're back up against the fact that Heroku does not promise a PCI-compliant environment to nearly the extent that Amazon does.
Gosh. Are we talking about migrating away from Heroku? :mouse:
Are your datacenters certified / PCI compliant?
All of our datacenters have been certified by national and/or international security standards.
Our NYC1 facility is SSAE16 SOC-1 Type II certified. Our NYC2 facility is SSAE16 SOC-2 Type II certified. Our NYC3 facility is SSAE16 SOC-2 and SOC-3 compliant. Our AMS1 and AMS2 facilities are ISO27001:2005 and ISO9001 certified. Our AMS3 facility is ISO9001, ISO27001, and SSAE16 Type II certified. Our SFO1 facility is SSAE16 SOC-1 Type II certified. Our SGP1 facility is ISO27001:2005 certified. Our LON1 facility is ISO9001:2008, ISO27001, and SSAE16 / ISAE 3402 certified. Our FRA1 facility is ISO9001:2008, ISO27001:2005, and ISO22301:2012 certified.
https://www.digitalocean.com/help/policy/
via https://www.digitalocean.com/community/questions/digital-ocean-pci-dss-server-compliance
Amazon > DO > Heroku (PCI-wise)
Okay! Reticketed as #3505. :swimmer:
Well, reopening because it still might make sense to separate out the vault from the main db.
See https://github.com/gratipay/inside.gratipay.com/issues/223 for overarching discussion about infosec risk management.
Here's the page listing Vault storage backends. Looks like the only real option for us is Consul. They recommend 3-5 nodes per data center, and common practice with AWS is to run with at least two data centers (availability zones [AZ] in AWS-lingo).
Could we get away with one node and an EBS volume?
"AWS Tips I Wish I'd Known Before I Started"
A collection of random tips for Amazon Web Services (AWS) that I wish I'd been told a few years ago, based on what I've learned by building and deploying various applications on AWS.
How about two AZs, with one EC2 instance each, running both Vault and Consul + one EBS volume?
I'd really, really like to outsource the task of private information management to trusted parties, which could provide security and privacy guarantees to our clients.
@techtonik Please make a concrete suggestion for how to do that in our case.
Ok. The suggestion is that we only have to keep records like this:
Then we need to query the bank requisites through OAuth or give the privacyservice1 a command "do transaction with ... on behalf of our client that is recorded as privacyservice1:id on your system".
I'd really avoid storing all the private info on GP, because then it will become an easy attack vector automatically.
@techtonik What are some examples of companies we could use for `privacyservice1`?
@whit537 I think that lawyers should know about privacy services, so they should be aware of privacy services in the banking industry as well.
@techtonik That's a cop out. Go find us some privacy services to talk to if you want us to pursue that possibility.
@whit537 we do have a lawyer that we pay, don't we?
@techtonik That's a cop out. Go find us some privacy services to talk to if you want us to pursue that possibility.
@whit537 is it possible to ask him directly first? I don't even know what these services are called in English, let alone the US-specific terms.
@techtonik Can you link us to one example of such a service? Doesn't have to be in English, I just have no idea what you're talking about right now. :-(
@whit537 I don't know. Some advanced banks have services that help to preserve user privacy. They can issue an anonymous or one-time debit card, for example, and may expose an OAuth management API that can also hide the identity. This decreases the risk that user data will be stolen and reused by a malicious party. Quick search - https://www.privacyworld.com/5mastercard.html
@techtonik I'm not sure what to do with that site. In any case, we were rejected by Citizens (#3366), so we're no longer trying to build our own direct ACH integration and we don't need to store bank numbers. We're going to try Zipmark instead (#3491). Closing ... for now(?!).
Zipmark didn't work out, and it turns out we want to do strong idv for employment reasons, not just AML reasons.
Blog post on Balanced's architecture:
http://blog.balancedpayments.com/balanceds-architecture/
`knox`, `midlr`, and `js` are all on their own Amazon account. Only a subset of our staff has access to this: I personally wouldn’t even know how to get into those servers.
`precog`, `api`, and `router` are all on an Amazon account which most of our developers have access to, and that’s where most of the actual work in building new features goes.
"best practices for storing ssn"
I would look towards HIPAA de-identification guidelines on protected data from HHS.
https://www.reddit.com/r/AskNetsec/comments/2pswf5/securely_storing_ssn_details/cmzyxno
Haven't considered HIPAA before.
http://www.heinz.cmu.edu/~acquisti/ssnstudy/
On Nov 28, 2015 12:55 AM, "Chad Whitacre" wrote:
"best practices for storing ssn" (https://www.google.com/search?q=best+practices+for+storing+ssn)
- http://stackoverflow.com/questions/254935/storing-social-security-numbers
- https://community.spiceworks.com/topic/739855-best-practices-storing-personal-data-including-ssns
- http://www.sqlservercentral.com/Forums/Topic1572097-373-1.aspx
- https://www.socialsecurity.gov/kc/id_practices_best.htm
- http://security.stackexchange.com/questions/9403/when-storing-private-identifying-information-in-a-web-application-what-is-indu
- http://discuss.fogcreek.com/joelonsoftware1/default.asp?cmd=show&ixPost=32808
- https://www.ssa.gov/phila/ProtectingSSNs.htm
- http://passcouncil.washington.edu/site/files/uw_ssn_best_practices.pdf
- http://passcouncil.washington.edu/site/files/uw_ssn_guide.pdf
- https://www.reddit.com/r/AskNetsec/comments/2pswf5/securely_storing_ssn_details/
We are going to start storing national identification numbers (https://github.com/gratipay/gratipay.com/issues/3289#issuecomment-107100341) as well as bank account numbers (#3377 downstream of #3366). We need a vault separate from our main application and database that is more highly secure. We should use the PCI DSS 3.0 standard to self-assess the security of our application (https://github.com/gratipay/inside.gratipay.com/issues/214). This ticket is about building a new vault component of our architecture.