danielberkompas / cloak

Elixir encryption library designed for Ecto
MIT License
582 stars 55 forks source link

Make Cloak.Vault a GenServer #75

Closed danielberkompas closed 6 years ago

danielberkompas commented 6 years ago

Right now, the init/1 callback is called each time for encrypt/decrypt. This might be slow, depending on the logic in init/1.

Spec

Make Cloak.Vaults into GenServers that must be added to your supervision tree. On startup, they store their configuration in a public ETS table.

The encrypt and decrypt functions would read configuration from this ETS table and perform their work in the client process. The code_change callback would trigger a config reload.

ulissesalmeida commented 5 years ago

Hello,

Fist, thank you for this awesome library. It has good defaults and I'm really happy.

However, the only thing that bothers me, is the need of the GenServer. I was wonder why not read the configuration directly from the Application.gent_env, since it's backed by an ETS table too. It was a little bit counterintuitive for me has to start a supervised process to call an encrypt function.

What do you think about it ? Why it is important to have an additional state management for the configuration. Why Application.get_env is not good enough ?

danielberkompas commented 5 years ago

This was done for several reasons.

  1. Performance. Application.get_env is significantly slower than reading configuration from an ETS table (at least in my testing). Cloak's encrypt/decrypt functions will get called very often in an app with a lot of encrypted fields, so performance is important. The Genserver starts and monitors an ETS table where it stores the configuration.

  2. More flexibility for configuration. Not everyone wants to put all their configuration in mix config files at compile time. With a Genserver, they also get a callback at runtime (init/1) which they can use to return their preferred configuration.

If you look at the implementation, you will see that the Vault.encrypt and Vault.decrypt functions execute in your current process, not the Genserver process, so there is no bottleneck. They simply read from the ETS table that is managed by the Genserver.

ulissesalmeida commented 5 years ago

@danielberkompas

Thank you so much for your answer! It was awesome.

I did also my benchmarks, and I have some ideas, and I would like to hear about your thoughts.

You can access mine benchmarks results here: https://gist.github.com/ulissesalmeida/b260d9330f368995b63d7635a4561161

  1. Performance

In my benchmark results, if we turn off parallelism, read from ETS table is 2 times faster than from Application.get_env. However, when we turning on parallelism, the current implementation vs read from ETS is kind the same. The option that outshines is ETS with reading concurrency turned on.

My benchmark was based on exs file, not a compiled for production binary, then, the results can be different.

But there's one option that would be faster, and we need any benchmark for that. It would make the configuration options be the encrypt function argument. For example:

config = Application.get_env(:my_app, :my_vault_config)
for i <- 1..1000 do
  Cloak.encrypt(i, config)
end

☝️ Here, for batch encryption, (that was my case), read from the configuration once and use that same config many times, it would be faster than read from any ETS or from Application.gent_env

  1. Flexibility for configuration

I liked you carefully thought about your library users in providing a way to have run time configuration. But I wonder, provide configuration as a function argument in the encrypt/decrypt functions would not be the definitive flexibility?

This library is being split between Ecto parts and pure Encryption part. Then, I believe, this core library can provide functions(no process, or application configuration) to encrypt/decrypt, like Erlang :crypto does.

Example

Let's make Cloak make encrypt/decrypt functions public with the second argument as config.

defmodule Cloak do
  def encrypt(plain_text, config)
  def decrypt(plain_text, config)
end

This way, I could use Cloak without worry about Application or GenServer, I could use as a function. This approach is also recommended by Elixir library guidelines: https://hexdocs.pm/elixir/library-guidelines.html#avoid-application-configuration

  1. The library can still deliver advanced configurations options

I'm not saying we should remove the current approach, but actually, offer more options of usage and reduce the entry level barrier.

# Quick use
Cloak.encrypt(
  "hello world", 
   encrypt_module: Cloak.Ciphers.AES.GCM, 
   key: Base.decode64!("3Jnb0hZiHIzHTOih7t2cTEPEpY98Tu1wvQkPfq/XwqE="), 
   tag: "V1"
)

Then, we can still offer the Cloak.Vault macros.

  1. Application.get_env vs GenServer

Today, you can put some config for the Vault in the Application. However, the GenServer copies that to its internal ETS table. What's happens if I need to change the env, many users will try to do something like Application.put_env and it will not work. It happens because the encrypt configuration has 2 sources of true.

My Suggestions

  1. Turn on the read_concurrent config of the Vault ETS table.
  2. If the developer provides Application configuration, the Vault should not have ETS table, keep the Application as the source of the true. It's slower, but it's more predictable.
  3. Provide public functions to encrypt in Cloak, use the config as a parameter.

What do you think? I can help you with that if you like my ideas. I also would like to hear your thoughts. If it makes sense or not.