networknt / light-eventuate-4j

An eventual consistency framework based on Event Sourcing and CQRS on top of light-4j and Kafka
Apache License 2.0

How to control which service can subscribe to one particular event type #20

Open stevehu opened 7 years ago

stevehu commented 7 years ago

This is needed for security reasons in banking applications, and it is actually part of a bigger topic: how to control Kafka access from all the services.

archenroot commented 6 years ago

https://developer.ibm.com/opentech/2017/05/31/kafka-acls-in-practice/

Still, you are using Zookeeper, so I think it could be the central point from which config is distributed to both the services and the Kafka broker instances. That way security is built into the core config component.

GavinChenYan commented 6 years ago

The framework uses an annotation to control which event type a service subscribes to. For example, in our sample:

@EventEntity(entity="com.networknt.eventuate.account.command.account.Account")
public interface AccountEvent extends Event { }

The entity is the aggregate type. The account event handler will subscribe to and process only the events in that topic.
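
As a rough illustration of how such an annotation can be turned into a subscription target, here is a minimal sketch using plain Java reflection. It is not the framework's actual code: the nested `EventEntity` and `Event` types are stand-ins declared locally, `AccountOpenedEvent` and `topicFor` are hypothetical names, and it assumes the aggregate type name is used directly as the topic name.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class EventTopicSketch {

    // Stand-in for the framework's @EventEntity annotation (assumption).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface EventEntity {
        String entity();
    }

    // Stand-in for the framework's Event marker interface (assumption).
    interface Event { }

    @EventEntity(entity = "com.networknt.eventuate.account.command.account.Account")
    interface AccountEvent extends Event { }

    // Hypothetical concrete event; it carries no annotation of its own.
    static final class AccountOpenedEvent implements AccountEvent { }

    // Looks for @EventEntity on the class itself, then recursively on its
    // interfaces, and returns the aggregate type (used here as the topic name).
    static String topicFor(Class<?> eventClass) {
        EventEntity direct = eventClass.getAnnotation(EventEntity.class);
        if (direct != null) {
            return direct.entity();
        }
        for (Class<?> parent : eventClass.getInterfaces()) {
            String topic = topicFor(parent);
            if (topic != null) {
                return topic;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(topicFor(AccountOpenedEvent.class));
    }
}
```

Note that this is only a name mapping inside the client; nothing in it stops another process from reading the same topic.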

archenroot commented 6 years ago

:-)) this is nice, but... I think in a banking environment it is not about preventing someone from trying to connect, but about protecting the system from those who are not authorized to connect.

This code of yours is "just" a mapping between an object name and a source name (where the object is available), but it doesn't prevent me from subscribing to the topic from the command line. And this is probably the core requirement: RESTRICT ACCESS TO TOPICS TO AUTHORIZED SERVICES/CLIENTS ONLY

I can imagine that sensitive data is sometimes available there. Another story is to encrypt the data before submitting it into the distributed Kafka world.

I just finished an AMQP-based project here in the energy market data area, and we actually implemented a highly secure messaging platform which uses PKI. This of course has the side effect that a message can only be decrypted by a single peer in the network. If you would like to send the same message to multiple subscribers, you need to encrypt and send it once per subscriber (say 4 times, each copy encrypted with a specific key). But in our case this overhead is not a problem, as we don't have millions of messages per minute...

Another way to manage public/private keys is not to generate them per instance/service, but to link each event to a key, so you then distribute the key only to the services which are authorized to receive that event. Again, you will need a management UI as well, where you manage the event <-> key mapping, revocation, etc. But I think it is doable.

Access Restriction vs Data Encryption

With Data Encryption you might see no need to control access to Kafka topics: if a service/client is not on the white-list or is completely unmanaged, it won't have the key available, so it cannot decrypt the data. AES instructions are built into today's CPUs, so in that case you will need to build either an encryption/decryption cluster ring or adapters into each service.

But it depends on other security factors. I think in the end you will possibly need a mixture of ACLs and encryption, depending on the nature of the data.

NOTE: I understand that in the banking domain this might not be an option, or might be something your project hasn't considered, but I worked on an AES implementation on GPU chips. These beasts (Tesla V100, etc.) can encrypt/decrypt gigabytes/s from/to Kafka streams... https://github.com/archenroot/AES-Cuda And you can implement a REST API directly in CUDA, so you call an endpoint directly on the GPU chip to encrypt/decrypt data.

GavinChenYan commented 6 years ago

Thanks for the comments.

As for Access Restriction to the event store (Kafka), we don't have access control in the framework. I think it will be controlled by the bank domain's security settings.

As for Data Encryption, thanks for the suggestion. We will take a look and try to implement it for the next release.

archenroot commented 6 years ago

I mean, I really don't need such a feature at the moment, and I also think it is quite a lot of work. I just provided comments on Steve's requirement as possible suggestions.

Regarding the bank's internal security system, it would still be good to think about a common interface to control this, whether it ends up being manual or integrated via an API... but this definitely requires more brainstorming...

archenroot commented 6 years ago

@stevehu @chenyan71

Hi guys, I was thinking about the security of the data along the service -> broker -> service path. I have already described the concept of ACLs on Kafka topics, as it is supported.

But if you would like to introduce higher-level security, as I suggested, it is possible to use PKI infrastructure, not to secure user -> user communication but service -> service communication. Not only can one service have multiple instances with access to the key, but you can also have multiple services able to read the message (shared key). While in standard PKI this would be a flaw, from my perspective it is completely regular usage here. Below I describe an approach to keep the platform highly secure.

Let's start over: we have some manager (operated by bank security operators) that manages a mapping between 3 entities, Event -> Key -> Service(s), with the following initial state: [image: initial state]

Now imagine that Subscriber B is removed from the trusted group. A new key is generated and automatically distributed to the service cluster. At this moment you still need the old key to be available, as there could be consumers who need time to consume older events (you can leave the obsolete key in the system for a long period of time; it becomes useless automatically by design). Once the new key is distributed, the producer starts encrypting new data with it (of course, you can distribute a TIME trigger along with the key, so the key switch is scheduled automatically), and you end up in the following situation: [image: state after Subscriber B is removed]

Consumer B won't be able to consume messages encrypted with the key newly distributed to producers. Only Consumer A will be able to read them.
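
The rotation scenario above can be sketched in a few lines of plain Java. This is only an illustration under assumptions that are not in the thread: keys are AES-256 used in GCM mode, key versions are identified by an integer key id, and `KeyRing`, `Envelope`, `encrypt` and `decrypt` are hypothetical names. The key manager itself is reduced to "put the key into the rings of the authorized services".

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class KeyRotationSketch {

    // An encrypted event; the key id tells consumers which key version was used.
    static final class Envelope {
        final int kid;
        final byte[] iv;
        final byte[] ciphertext;

        Envelope(int kid, byte[] iv, byte[] ciphertext) {
            this.kid = kid;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    // A service's local key ring: key id -> key, as handed out by the key manager.
    static final class KeyRing {
        final Map<Integer, SecretKey> keys = new HashMap<>();
    }

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Producer side: encrypt with the given (current) key version and a fresh IV.
    static Envelope encrypt(int kid, SecretKey key, String event) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return new Envelope(kid, iv, cipher.doFinal(event.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Consumer side: returns null when the ring does not hold that key version.
    static String decrypt(KeyRing ring, Envelope envelope) {
        SecretKey key = ring.keys.get(envelope.kid);
        if (key == null) {
            return null;
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, envelope.iv));
            return new String(cipher.doFinal(envelope.ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyRing consumerA = new KeyRing();
        KeyRing consumerB = new KeyRing();

        // Initial state: key version 1 is distributed to both consumers.
        SecretKey k1 = newKey();
        consumerA.keys.put(1, k1);
        consumerB.keys.put(1, k1);
        Envelope before = encrypt(1, k1, "AccountOpened");

        // Consumer B is removed: key version 2 is distributed to consumer A only.
        SecretKey k2 = newKey();
        consumerA.keys.put(2, k2);
        Envelope after = encrypt(2, k2, "AccountDebited");

        System.out.println(decrypt(consumerB, before)); // old events stay readable
        System.out.println(decrypt(consumerA, after));
        System.out.println(decrypt(consumerB, after));  // null: B never got version 2
    }
}
```

Carrying the key id next to the ciphertext is what lets old and new keys coexist while slow consumers catch up.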

The Key Manager can at the same moment distribute ACL lists.

With this approach you distribute BINARY data through Kafka. Or you can still use JSON, but then you should encode the binary data using Base64, Base122 or Base128.
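
For the JSON variant, a minimal sketch with the JDK's built-in Base64 codec could look like this. The `payload` field name and the hand-rolled string handling are assumptions made to keep the example dependency-free; a real service would use its JSON library. Base122 and Base128 are not part of the JDK and are not shown.

```java
import java.util.Arrays;
import java.util.Base64;

final class Base64EnvelopeSketch {

    // Wraps binary ciphertext in a JSON string field. Standard Base64 output
    // contains only A-Z, a-z, 0-9, '+', '/' and '=', so no JSON escaping is needed.
    static String toJson(byte[] ciphertext) {
        return "{\"payload\":\"" + Base64.getEncoder().encodeToString(ciphertext) + "\"}";
    }

    // Extracts the field again; a real service would use a JSON parser instead.
    static byte[] fromJson(String json) {
        String marker = "\"payload\":\"";
        int start = json.indexOf(marker) + marker.length();
        int end = json.indexOf('"', start);
        return Base64.getDecoder().decode(json.substring(start, end));
    }

    public static void main(String[] args) {
        byte[] binary = {0, (byte) 0xFF, 0x10, (byte) 0x80};
        String json = toJson(binary);
        System.out.println(json);
        System.out.println(Arrays.equals(binary, fromJson(json)));
    }
}
```

Base64 grows the payload by roughly one third, which is the price of staying inside a text format.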

Offtopic: Another topic could be (as an extension), how to send via Kafka LARGE BINARY (ENCRYPTED) STREAMS and I have quite nice solution to this as well.

stevehu commented 6 years ago

@archenroot Thanks for all the great ideas. This is just like a brainstorming session on the internet. As you know, we have light-oauth2, which is an OAuth 2.0 provider with a lot of enhancements to support microservices. One of its features is a public key certificate distribution API that allows different services to go to a central location to get a public certificate with a kid. It is the foundation for data confidentiality and integrity.

To protect data flow in Kafka or any message broker, we need to ensure that the data can be encrypted if necessary and we need to ensure that data can be signed so that it is not modified along the way.
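
The signing half of that requirement can be sketched with the JDK's `Signature` API. This is a minimal illustration, not light-4j code: the RSA key pair is generated locally instead of being fetched from a certificate distribution service, and `SHA256withRSA` is an assumed algorithm choice.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

final class SignedEventSketch {

    // Stand-in for a key pair whose public half would be published centrally.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Producer side: sign the event payload with the producer's private key.
    static byte[] sign(PrivateKey key, byte[] payload) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(payload);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Consumer side: check the payload against the producer's public key.
    static boolean verify(PublicKey key, byte[] payload, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(payload);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair producer = newKeyPair();
        byte[] event = "AccountOpened".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(producer.getPrivate(), event);

        byte[] tampered = "AccountClosed".getBytes(StandardCharsets.UTF_8);
        System.out.println(verify(producer.getPublic(), event, signature));
        System.out.println(verify(producer.getPublic(), tampered, signature));
    }
}
```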

Both can be done with PKI as you have mentioned, but encryption is a little bit different due to performance issues. We are planning to implement something like the TLS protocol: exchange the public key cert in a handshake and use a symmetric key to encrypt the data. TLS is used to secure the channel, but we are using the same idea to secure the payload only. This is very useful in our light-hybrid-4j framework, as we are not using anything in HTTP but the body stream for communication.

I am very interested in these discussions but sometimes cannot respond in a timely fashion. Last week we had a three-day training session for designers and couldn't do anything but focus on that. This is the beauty of forum-like discussion: it can last several days or weeks :) Let the good ideas flow...
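
The general idea of protecting a payload with a symmetric key that is itself protected by a public key can be sketched as follows. This is not the planned light-4j design: there is no handshake and no certificate lookup, the recipient's RSA key pair is generated locally, and RSA-OAEP key wrapping plus AES-GCM are assumed algorithm choices. `Sealed`, `seal` and `open` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class HybridPayloadSketch {

    private static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // What travels through the broker: wrapped AES key, IV and ciphertext.
    static final class Sealed {
        final byte[] wrappedKey;
        final byte[] iv;
        final byte[] ciphertext;

        Sealed(byte[] wrappedKey, byte[] iv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    // Stand-in for the recipient's key pair; its public half would be published.
    static KeyPair newRecipientKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sender: encrypt the payload with a fresh AES key, then wrap that key with RSA.
    static Sealed seal(PublicKey recipient, byte[] payload) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            SecretKey dataKey = generator.generateKey();

            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
            byte[] ciphertext = aes.doFinal(payload);

            Cipher rsa = Cipher.getInstance(RSA_OAEP);
            rsa.init(Cipher.WRAP_MODE, recipient);
            return new Sealed(rsa.wrap(dataKey), iv, ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recipient: unwrap the AES key with the private key, then decrypt the payload.
    static byte[] open(PrivateKey recipient, Sealed sealed) {
        try {
            Cipher rsa = Cipher.getInstance(RSA_OAEP);
            rsa.init(Cipher.UNWRAP_MODE, recipient);
            SecretKey dataKey = (SecretKey) rsa.unwrap(sealed.wrappedKey, "AES", Cipher.SECRET_KEY);

            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(128, sealed.iv));
            return aes.doFinal(sealed.ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair recipient = newRecipientKeyPair();
        Sealed sealed = seal(recipient.getPublic(), "AccountOpened".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(open(recipient.getPrivate(), sealed), StandardCharsets.UTF_8));
    }
}
```

Only the small AES key goes through the slow RSA operation; the payload itself is handled by the fast symmetric cipher, which is the performance point made above.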

archenroot commented 6 years ago

> One of the features is a public key certificate distribution API to allow different services to go to a central location to get a public certificate with a kid.

Nice, so the key building block already exists...

> Both can be done with PKI as you have mentioned, but encryption is a little bit different due to performance issues. We are planning to implement something like the TLS protocol to exchange the public key cert in a handshake and use a symmetric key to encrypt data.

Totally agree. That is why, even when only the channel is encrypted, we use TLS interceptors at the front end, behind which the channel doesn't need to be encrypted. As I understand it, you use PKI to encrypt the delivery of something like AES-256 CBC keys, which are much faster to apply by default; on top of that, you have hardware support on the CPU via a hardware instruction set: https://software.intel.com/sites/default/files/article/165683/aes-wp-2012-09-22-v01.pdf

Yes, that's it.

Many thanks for this discussion, guys; we shed light on many aspects!

stevehu commented 6 years ago

Special instruction sets on Intel or AMD CPUs, as well as GPU encryption, should be considered as plugin modules and injected per instance configuration with the service module (our in-house IoC). Yes, we've touched a lot of areas, and these need to be addressed in the mid-term.