mojaloop / project

Repo to track product development issues for the Mojaloop project.

Investigate how we might integrate HSM with Mojaloop Deployment #852

Closed lewisdaly closed 4 years ago

lewisdaly commented 4 years ago

Goal:

As an OSS Maintainer, I want to investigate how we might be able to support the use of HSM for on-premises or in-cloud Kubernetes deployments of Mojaloop so that we have an understanding of what needs to be done to support HSM within Mojaloop.

Support for HSM is an issue that I've heard a lot about from FSP stakeholders and fintechs between the PI-7 meeting and the DFS Lab hackathon, and I think it would be useful to have a preliminary investigation on what a potential implementer would need to do to support HSM.

The outputs of this task would likely be a new piece of documentation outlining a path forward, that could be used as a starting point by an implementer wishing to add HSM support to Mojaloop.

Feel free to discuss.

Tasks:

Acceptance Criteria:

Pull Requests:

Follow-up:

Dependencies:

Accountability:

zeeemz commented 4 years ago

Hey Lewis,

Let's discuss this on Slack; I would love to assist you with this.

Best, Azeem Paysys Labs

godfreykutumela commented 4 years ago

Hello @elnyry and @zeeemz, just checking whether you have already started tackling this one? I have pulled it into the current sprint, so let me know if this is OK with you both.

zeemzz commented 4 years ago

Hello @elnyry & @godfreykutumela ,

We first need to figure out where we want to place the HSM in the Mojaloop ecosystem.

So the first question to ask is: are we adding HSM support at the Hub or not? The HSM requirement mostly comes from DFSPs with established security standards. It also comes from regulatory regimes: some regulators force financial institutions to split their key management across N key custodians.

The Mojaloop Hub expects DFSPs to do their own key management and share certificates with the Hub via authenticated channels (email / MCM portal).

It's assumed that DFSPs would adhere to the PKI best practices defined here. However, one key point to note is that the guidelines focus on DFSPs with limited resources (probably MFIs) and assume that they will use OpenSSL for key generation and then safely store the private keys (the onus of securing private keys lies with the participants).

OpenSSL is a widely used tool, but it has been prone to widespread vulnerabilities; offloading key management to dedicated hardware gives FIs a sense of confidence. In the context of Mojaloop, a scheme rule could be enforced so that FIs whose transactions reach a threshold of X amount must have a dedicated HSM for key management and crypto handling.

How key management works when an HSM is used at FIs:

The HSM XORs components 1 and 2 and creates a key from both components.

The key here is effectively Key = Component 1 XOR Component 2. The HSM encrypts "0000000000000000" using the generated key and creates a Key Checksum Value.

The custodians share the key components with the receiving entity (which could be the hub / payment scheme); two separate custodians then go to the HSM and punch in the components received. The receiving entity validates the Key Checksum Value, which should be the same across both HSMs since the same key has been used to encrypt the zeros.
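To make the component flow above concrete, here is a minimal software sketch of it, not how any particular HSM implements it. The 128-bit key length and the function names are assumptions for illustration. Note the swapped-in technique: a real HSM derives the checksum value by encrypting the zero block with a block cipher under the key, whereas this sketch uses a truncated HMAC-SHA256 over the zero block as a stand-in so that it runs with the Python standard library only.

```python
import hashlib
import hmac
import secrets

KEY_LEN = 16  # bytes; a 128-bit key is an assumption for this sketch


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("components must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def check_value(key: bytes) -> str:
    """Stand-in for the Key Checksum Value.

    A real HSM encrypts a block of zeros with the key using a block cipher.
    Here a truncated HMAC-SHA256 over the zero block plays that role.
    """
    zero_block = bytes(8)  # "0000000000000000" in hex is 8 zero bytes
    return hmac.new(key, zero_block, hashlib.sha256).digest()[:3].hex().upper()


# Sending side: two custodians each hold one component.
component_1 = secrets.token_bytes(KEY_LEN)
component_2 = secrets.token_bytes(KEY_LEN)
sender_key = xor_bytes(component_1, component_2)
sender_kcv = check_value(sender_key)

# Receiving side: two separate custodians enter the components they received.
receiver_key = xor_bytes(component_1, component_2)
receiver_kcv = check_value(receiver_key)

# Both sides compare checksum values instead of ever comparing the key itself.
kcv_matches = hmac.compare_digest(sender_kcv, receiver_kcv)
print("checksum values match:", kcv_matches)
```

The point of the checksum comparison is that the two parties can confirm they loaded the same key without revealing the key or either component over the comparison channel.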

I am thinking through how this could be mapped to the Mojaloop environment and will share some thoughts in a couple of days.

P.S. The HSMs used here are Thales/Gemalto, but typically all PCI-compliant HSMs defined here should work.

millerabel commented 4 years ago

Thanks for starting this discussion. Indeed, the focus is on identifying all the places where we need an HSM to prudentially secure key materials in a Mojaloop-enabled scheme.

I would raise the HSM requirement to FIPS 140-2 Level 3 for an HSM that sits in the hub (if we decide there is a need for an HSM there). And as was mentioned, the size and sophistication of the FSP will dictate what type of HSM they use. I would argue, however, that the SDK can package a lot of the complexity of using an HSM, so requiring it for all DFSPs is not unreasonable.

A tamper-resistant PCI-card HSM may be sufficient for small MFIs. Level 3 requires identity-based authentication for any key operations in addition to physical auto-zeroizing if tampered with. If someone steals it, it's useless. (See e.g. https://safenet.gemalto.com/data-encryption/hardware-security-modules-hsms/pci-hsm/)

Just to add here, @millerabel: all major cloud providers, Azure and AWS specifically, provide FIPS 140-2 Level 3 compliant HSMs, working with major HSM vendors, and also facilitate key ceremonies onsite at the data centre where needed, so we are good in this regard.

millerabel commented 4 years ago

Most FIPS-compliant HSMs will include three-part key generation, or splitting and combining, functions. You do not need to use an in-production HSM to generate key shares (in fact, you should not do that!). Any FIPS-compliant HSM can be used to generate the key shares, which are then entered into the production HSMs of both parties using a controlled key loading process. Either party can generate the key shares; both parties load the shares into production HSMs.

In the following description, the BDK is a Base Diversification Key, a secret master key that is used by counter-parties at the root of a security zone to subsequently generate diversified symmetric keys from non-secret identifiers. This avoids the need to share many secret keys over the life of the relationship, which is an expensive process. But you can use this key share process for any secret symmetric key that must be exchanged between two HSMs.

Here is a description of the process:

image

Two 128-bit random bit strings are generated in the HSM. It is important that these random bit strings be generated using a cryptographically strong random or pseudo-random bit string generator with sufficient available entropy. The two random bit strings represent the first two key shares. These values should be generated freshly for each secret key to be exchanged and not reused.

Once the first two random key shares are generated, the secret key is then combined with the two key shares, through two exclusive-or (XOR) operations, to produce the third key share:

KS3 = BDK ⊕ KS1 ⊕ KS2

The secret key is never exposed outside of the HSM that generates it and each HSM into which the generated key shares are loaded and recombined should have the resulting key's attributes set to "no export."
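The share arithmetic described above can be sketched in software as follows. This only illustrates the XOR relationship KS3 = BDK ⊕ KS1 ⊕ KS2 and its inverse; in a real deployment these operations happen inside the HSM and the secret key never appears in application memory. The function names and the random stand-in for the BDK are assumptions for illustration.

```python
import secrets


def xor_bytes(*parts: bytes) -> bytes:
    """XOR any number of equal-length byte strings together."""
    length = len(parts[0])
    if any(len(p) != length for p in parts):
        raise ValueError("all inputs must have the same length")
    out = bytearray(length)
    for part in parts:
        for i, value in enumerate(part):
            out[i] ^= value
    return bytes(out)


def split_key(secret_key: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a secret key into three shares: KS3 = key XOR KS1 XOR KS2."""
    ks1 = secrets.token_bytes(len(secret_key))  # fresh random share
    ks2 = secrets.token_bytes(len(secret_key))  # fresh random share
    ks3 = xor_bytes(secret_key, ks1, ks2)
    return ks1, ks2, ks3


def combine_shares(ks1: bytes, ks2: bytes, ks3: bytes) -> bytes:
    """Recombine the shares: key = KS1 XOR KS2 XOR KS3."""
    return xor_bytes(ks1, ks2, ks3)


# Random 128-bit value standing in for the BDK (illustration only).
bdk = secrets.token_bytes(16)
shares = split_key(bdk)
recovered = combine_shares(*shares)
print("recovered key matches:", recovered == bdk)
```

Because KS1 and KS2 are uniformly random, any two of the three shares on their own reveal nothing about the key; all three are needed to recombine it.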

godfreykutumela commented 4 years ago

An HSM as a Service deployment option is also available; see my notes below:

Hardware security modules (HSM) have been the tried-and-true method of managing the encryption keys critical to securing payment information. Payment companies traditionally have purchased HSMs and deployed them on-premises. Some cloud providers have implemented cloud-based HSMs. Each of these solutions has some challenges in today’s world of multicloud environments and global footprints, including the following:

The cost and logistics of buying traditional HSMs can be prohibitive to the deployment of new products or in new markets ahead of a revenue stream for them.

Cloud-based HSMs were originally designed for general-purpose use without necessary features for securing digital payments such as PIN encryption.

HSM as a Service is a new approach to managing encryption and tokenization of data, which provides HSM-grade security without the need for hardware. HSM as a Service is ideal for the multicloud e-commerce and payment environments of today. It incorporates capabilities, features and functions designed specifically to address the needs of secure, high-volume, real-time digital payments in cloud environments. HSM as a Service addresses these needs in the following ways:

  1. Designed for multicloud environments

As a cloud-neutral service, it is quickly implemented and scales to meet the dynamic and cyclic payment processing demands across leading cloud environments such as AWS, Azure, Google, IBM and Oracle.

Supports Bring Your Own Key (BYOK) so the same key can be used across multiple clouds, ensuring only authorized users can access encrypted keys.

Available globally and can be co-located with other services for minimum latency and for storage of keys proximate to data across multiple cloud providers.

Provides options for alternative connectivity paths to ensure high availability of encryption key management services.

  2. Supports the highest level of data security

Built to ensure key material is never available in plaintext to any software component. Provides complete data tokenization control in support of real-time payment platforms, with no need for third-party services that are vulnerable to account information breaches.

Cloud-friendly APIs to develop new applications with secure handshaking, passing tokenized data to gain secure access in compliance with data sharing regulations such as PSD2. Keys that encrypt traffic between devices are never exposed in plaintext on the system memory bus or on any other physical interface.

Keys are maintained separately from encrypted data to provide the highest possible level of data security.

  3. Optimum performance of a cloud-based deployment

Encryption keys can be located at the digital edge, close to the cloud, network providers, retailers and payment services to guarantee the fastest, most secure and lowest-latency interconnections.

In Summary

Supports data sovereignty as required by GDPR, securely and optimally maintaining encryption keys and encrypted data in the country where data is collected or created.

HSM as a Service is the ideal means of managing encryption keys and providing tokenization in complex environments that support e-commerce and secure digital payments. Cloud-neutrality, on-demand scalability and co-location at the edge where e-commerce, population centers and digital ecosystems meet help organizations simplify encryption key management without sacrificing security.

HSM as a Service can securely generate, store and use cryptographic keys and tokens vital to secure digital payments. While encryption key management in a multicloud environment can be complicated, HSM as a Service combines simplicity and robust protection to ensure secure digital payments.

kjindani commented 4 years ago

A couple of points w.r.t. the selection of the on-premises vs. on-cloud option and its implications for developing HSM support either in the Hub or on the DFSP end.

From a cost perspective, here is a current pricing comparison of HSMs:

Thales payShield HSM: cost US$39,000 (including Year 1 support); annual support (15%) US$5,850; 5-year CAPEX + OPEX = US$62,400

Amazon HSM on cloud as a service: US$1.45 per hour; FIPS 140-2 Level 3 compliant HSM, but the model is unknown (https://aws.amazon.com/cloudhsm/features/); 5-year cost = US$63,510

Azure Dedicated HSM: US$4.85 per hour; FIPS 140-2 Level 3 compliant; SafeNet Luna Network HSM 7 Model A790 cloud-based HSM; 5-year cost = US$212,430
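For anyone who wants to redo the five-year comparison with their own quotes, the arithmetic behind the figures above can be sketched like this. It assumes a single HSM instance running 24 hours a day, 365 days a year, and that Year 1 support is included in the purchase price, as stated for the Thales option; the function names are only for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # assumes one instance running continuously
YEARS = 5


def purchased_hsm_cost(purchase_price: float, support_rate: float,
                       years: int = YEARS) -> float:
    """CAPEX plus annual support, with Year 1 support included in the price."""
    annual_support = purchase_price * support_rate
    return purchase_price + annual_support * (years - 1)


def hourly_hsm_cost(rate_per_hour: float, years: int = YEARS) -> float:
    """Pay-per-hour cloud HSM running continuously for the whole period."""
    return rate_per_hour * HOURS_PER_YEAR * years


# Prices taken from the comparison in this thread.
thales = purchased_hsm_cost(39_000, 0.15)
aws = hourly_hsm_cost(1.45)
azure = hourly_hsm_cost(4.85)

for name, cost in [("Thales", thales), ("AWS", aws), ("Azure", azure)]:
    print(f"{name}: US${cost:,.0f} over {YEARS} years")
```

With these inputs the sketch reproduces the 62,400, 63,510 and 212,430 totals quoted above. A production deployment would normally need at least two HSMs for availability, which scales every option accordingly.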

Prerequisite for HSM as a service would be that

In terms of implementation, each brand of HSM has its own command set, so the implementation will need an abstraction layer; the actual command set can then be implemented depending on the model of HSM.
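One possible shape for such an abstraction layer is sketched below. Everything here is hypothetical: the `HsmProvider` interface, its two methods, the provider registry and the `software-stub` backend are illustration only and do not correspond to any existing Mojaloop component or vendor SDK. A real backend would translate these calls into the vendor's own command set (or PKCS#11 calls) rather than holding keys in process memory.

```python
from abc import ABC, abstractmethod
import hashlib
import hmac
import secrets


class HsmProvider(ABC):
    """Hypothetical vendor-neutral interface that application code depends on."""

    @abstractmethod
    def generate_key(self, label: str) -> None:
        """Create a key inside the provider and store it under a label."""

    @abstractmethod
    def mac(self, label: str, data: bytes) -> bytes:
        """Compute a MAC over data with the labelled key."""


class SoftwareStubProvider(HsmProvider):
    """In-memory stand-in for tests; NOT a secure key store."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def generate_key(self, label: str) -> None:
        self._keys[label] = secrets.token_bytes(32)

    def mac(self, label: str, data: bytes) -> bytes:
        return hmac.new(self._keys[label], data, hashlib.sha256).digest()


# Vendor-specific backends would be registered here under their own names.
PROVIDERS: dict[str, type[HsmProvider]] = {"software-stub": SoftwareStubProvider}


def get_provider(name: str) -> HsmProvider:
    """Select a backend by configuration name."""
    try:
        return PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"unsupported HSM provider: {name}") from None


hsm = get_provider("software-stub")
hsm.generate_key("jws-signing")
tag = hsm.mac("jws-signing", b"transfer-payload")
print(len(tag), "byte MAC computed via the abstract interface")
```

Callers only ever see labels and results, never key bytes, so swapping the stub for a hardware-backed provider would not change application code.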

bukasaaime commented 4 years ago

Good day all. Please see my contribution, based on the overall Mojaloop architecture and on my own experience implementing cryptographic projects in the banking sector.

Implementing a Hardware Security Module (HSM) is a regulatory issue, often mandated by the Payment Card Industry (PCI) to protect cryptographic keys in general and PIN or CVV/CVC generation/verification keys in particular. An HSM is a tamper-resistant subsystem where the cryptographic keys to be protected are kept inside the HSM registers. In many of the implementations I have worked with, only one key is kept inside the HSM: the Master Key. For added security, the Master Key is randomly generated by the HSM random generator function. A Master Key-encrypted key hierarchy is then established (see figure 1).

image

Figure 1

The Key Hierarchy will comprise Key Encrypting Keys and operational or data keys, covering both Public Key Infrastructure (PKI) keys and symmetric or session keys. The Key Hierarchy will remain outside of the HSM on a hard drive due to space constraints. An HSM needs to support both PKCS and FIPS levels. Both the Mojaloop Hub and FSPs need to implement and manage a cryptographic subsystem. Mojaloop developers will need to be shielded from the complexities of the cryptographic subsystems:

  1. HSM management
  2. Cryptographic keys management
  3. Complex HSM APIs and parameters
  4. Variety of HSM Cryptographic providers

A Cryptographic Provider in this case would be, for instance, a cloud HSM environment (AWS, Azure or others) or a discrete HSM provider such as the IBM Common Cryptographic Architecture provided by the IBM 4764 cryptographic coprocessor, the Thales APDU commands provided by an RG 9000, or an nCipher HSM. An HSM needs to be implemented in the context of the overall Mojaloop Security Architecture, not in isolation. Hence I would suggest we create a simplified Mojaloop Cryptographic API or microservices that will abstract the above complexities of the cryptographic subsystem. We also need to take into account the overall Mojaloop architecture, which mainly implements a publish/subscribe event bus, Kafka in this case. The Mojaloop crypto API will also need to be implemented in the context of the Public Key Infrastructure (PKI), where symmetric keys are not manually exchanged but rather generated as session keys and exchanged in real time.
In choosing the Crypto Provider, we also need to take latency into account should the crypto processes be located far from the application that uses them. We will need to take into account certificate generation and Certification Authority (public or private) processes, and a Key Management System for symmetric keys. Mojaloop relies on Kafka, and therefore on a publish-subscribe architecture, which has an impact on how a cryptographic architecture is implemented (see figure 2).

  1. SSL termination
  2. Authentication of FSP crypto calls, where the FSP ID and secret keys would have been obtained through a registration process.
  3. Encryption and decryption of data protected by session keys
  4. Other cryptographic processes such as hash or MAC, and others yet to be defined.

image

Figure 2

millerabel commented 4 years ago

DFSPs will not need a full-sized Thales HSM. They can use a PCIe card HSM like the Thales Luna PCIe. These devices range in performance from the entry-level A700 at 1,000 RSA operations per second / 2,000 AES OPS to over 10,000 RSA OPS / 17,000 AES OPS for the A790. And the A700 costs less than $8,000.

There is also a smaller version, the Luna PCM. It is packaged as a PCMCIA card. It costs less than $3,000.

So it is not necessary to invest in an Ethernet-grade Luna SA or equivalent HSM.

The design suggested above puts almost no load on the HSM. Only wrapping keys and session keys are generated in the HSM, with all operational crypto operations performed in software. However, the PCM HSM is sufficiently performant that it would actually accelerate crypto operations for a typical server box and could perform most application-level crypto directly inside the HSM.

It's important to remember that unit deployment cost must be minimized by our design choices. The design must be realistically sized, of course, but we don't need to start with the most expensive hardware or cloud services. The TCO of a Luna PCM will be a fraction of that of these larger Ethernet-connected boxes, and it is suitable for use by DFSPs for Interledger fulfillment generation, VPN root cert, JWS keys, ... And since the cards fit in a standard server, the solution can be packaged for easy deployment by small DFSPs with limited data center capabilities.

godfreykutumela commented 4 years ago

This is just to summarise the HSM design input provided over the last 2 weeks. I will set up a session for next week to deliberate more on this and adopt one or two approaches for a POC exercise. Mojaloop HSM Positioning Document - 20 March 2020.docx

godfreykutumela commented 4 years ago

Hi All,

Please find a use case from Applied Payment for your review ahead of tomorrow's weekly stand-up.

Best Regards,

Godfrey Mojaloop HSM Integration v0.7-RP Review.docx

bukasaaime commented 4 years ago

I have looked at the OTP use case in the context of the overall Mojaloop security architecture and here are some recommendations with regards to the proposal put forward by Applied Payment Technology.

These remarks or recommendations can also be used for other use cases involving function calls to the crypto subsystem (HSM and related crypto APIs or commands).

This might seem complex but a lot of simplification has been suggested by introducing a multi-layered message schema and an API economy approach.

  1. This is an integration exercise.
  2. Major EFT switches have been modernized and come with an API gateway and an SDK.
  3. In the case of Postilion, ACI has introduced the Universal Payment (UP) Framework with API manager and an SDK.
  4. What we need to define is a multi-layered Mojaloop API or message schema made up of https, client-authentication and payload layers.
  5. Authentication layer will comprise the client ID and secret key obtained during Client (FSP) registration with WSO2 API manager or a custom Mojaloop registration process.
  6. Mojaloop hub will be both an Authorization and Resource provider at the same time in a B2B architecture.
  7. The payload will carry the EFT switch-translated PIN block from ATM Zone Pin Key (ZPK) or POS DUKPT encrypted PIN Block into https session key encrypted PIN Block.
  8. No Terminal Master Key (TMK) or PIN Encrypting Key (PEK) is used in the process.
  9. The payload will also carry the MSISDN and other parameters such as the MAC or the signature.
  10. This will decouple the Mojaloop crypto processes from the EFT Switch crypto processes. No need for an additional HSM other than the one(s) used by the EFT switch.
  11. I would expect the switch to implement the Mojaloop schema in the message to the Mojaloop hub.
  12. The PIN or OTP is generated by the Payer FSP. This means that the PIN generation Key will be defined by the Payer FSP Key Management Processes.
  13. Mojaloop needs to understand REST api. Here again, schema will apply. Mojaloop SDK to be used.
  14. The https layer will be peeled off by SSL termination in the Mojaloop hub HSM. Perhaps implement TLS 1.3 with elliptic curve cryptography, Advanced Encryption Standard (AES) instead of Triple DES (Triple Data Encryption Standard), and HMAC. TLS 1.3 is much more modern and much more secure than its predecessors TLS 1.1 and TLS 1.2.
  15. No need to worry about POS DUKPT and ATM, ZPK or TMK keys, since the switch would carry out the translation from a ZPK or DUKPT key into an https session key and the API gateway will prepare the Mojaloop-specific OTP message.
  16. At the Mojaloop hub HSM, the PIN translation will be carried out from session key 1 (EFT switch to hub) into session key 2 (Mojaloop hub to payer FSP).
  17. The payer FSP needs to consume the Mojaloop OTP message.
  18. Payer FSP has the PIN Verification Key, since OTP was generated by payer FSP.
  19. The Mojaloop API schema or message needs to be generic to accommodate other use cases.
  20. Each use case needs to cater for client authentication (verifying the client ID before executing the rest of the routine). The client ID and secret key are obtained during client registration.
  21. We need to implement a Key Management system on the Mojaloop hub for cryptographic key definitions and HSM management.
  22. The crypto provider needs to be abstracted from the REST API
  23. KMS needs to be abstracted as well. Both symmetric and asymmetric keys need to be managed by the KMS
  24. A decision needs to be made on two-way SSL between the Mojaloop hub and its various clients. This is more secure but adds complexity, since client certificates and a private Certification Authority will need to be introduced and managed by the Mojaloop hub.
  25. LSP adapter to be replaced by or merged into the main Mojaloop API gateway because EFT Switches are no longer legacy systems. They have been modernised and are advanced systems that can define APIs and related custom message schemas.
  26. Define a separate Crypto API that will be called by all other Mojaloop processes, such as OTP, Authentication, etc., and by the main Mojaloop API gateway.
  27. Payer FSP to import OTP using Mojaloop hub to Payer FSP session key.
  28. Payer FSP to verify OTP using the PIN generation Key protected by an HSM.
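To illustrate the client-authentication and payload layers from the list above (items 4, 5, 9 and 20), here is a minimal sketch of how a request could carry a client ID, a payload and a MAC, and how the hub side could verify the client before doing anything else. All names are hypothetical (`CryptoRequest`, `REGISTERED_CLIENTS`, the `msisdn` and `pinBlock` fields), the registry stands in for whatever the registration process produces, and the https layer and real PIN block handling are out of scope.

```python
from dataclasses import dataclass
import hashlib
import hmac
import json

# Hypothetical registry filled in during client (FSP or switch) registration.
REGISTERED_CLIENTS = {"eft-switch-01": b"demo-secret-not-for-production"}


@dataclass(frozen=True)
class CryptoRequest:
    client_id: str   # client-authentication layer
    payload: dict    # payload layer (e.g. MSISDN and translated PIN block)
    mac: str         # MAC over client ID and payload, hex encoded


def _tag(secret: bytes, client_id: str, payload: dict) -> str:
    """HMAC-SHA256 over the client ID and a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret, client_id.encode() + b"." + body,
                    hashlib.sha256).hexdigest()


def sign_request(client_id: str, secret: bytes, payload: dict) -> CryptoRequest:
    """Client side: attach the MAC before sending."""
    return CryptoRequest(client_id, payload, _tag(secret, client_id, payload))


def authenticate(request: CryptoRequest) -> bool:
    """Hub side: verify the client ID and MAC before running the routine."""
    secret = REGISTERED_CLIENTS.get(request.client_id)
    if secret is None:
        return False
    expected = _tag(secret, request.client_id, request.payload)
    return hmac.compare_digest(expected, request.mac)


request = sign_request(
    "eft-switch-01",
    REGISTERED_CLIENTS["eft-switch-01"],
    {"msisdn": "27820000000", "pinBlock": "<session-key-encrypted PIN block>"},
)
print("authenticated:", authenticate(request))
```

Keeping the authentication check separate from the payload handling is what lets the same envelope be reused for other use cases with different payloads.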

godfreykutumela commented 4 years ago

Thanks @bukasaaime for the input and I will forward to Max and Renjith to review ahead of tomorrow's round up discussion on this.

bukasaaime commented 4 years ago

Thank you @godfreykutumela . Much appreciated.