devshawn / kafka-gitops

🚀Manage Apache Kafka topics and generate ACLs through a desired state file.
https://devshawn.github.io/kafka-gitops
Apache License 2.0
317 stars 71 forks

Support for templating? #46

Open erikhh opened 3 years ago

erikhh commented 3 years ago

We run a multi-tenant system. In general, environments are the same except for the name of the tenant. To keep hosting overhead down, we'd like to have all our tenants share one Kafka cluster, where we keep the tenants apart using ACLs.

To do this now with Kafka-Gitops, we would need a lot of repetition in our state file, and thus a lot of messy copy-paste-adapting every time we need to connect a new tenant to our system, or when we want to introduce a new application service for all our tenants.

In Terraform you can have variables and such, and generate resources for each item in a list. We've set it up so that we can just add the Tenant to a list and from there on Terraform creates all the resources needed for that tenant. Are there any plans to add this kind of templating to Kafka-Gitops as well?

Maybe even just a way to share a custom ACL between multiple services or users without having to repeat the whole ACL for each and every one. That would already make the state file a whole lot more concise for us.

devshawn commented 3 years ago

Hi @erikhh, apologies for the late response. It's a busy time of year for me. :)

I think this is a great idea, and we've discussed some similar features in some of the other open feature requests. I think some sort of templating would be helpful. Do you have specific examples of the types of templating that would benefit you the most?

> Maybe even just a way to share a custom ACL between multiple services or users without having to repeat the whole ACL for each and every one. That would already make the state file a whole lot more concise for us.

I like this idea -- agreed. I'll have to think about what the format of this would look like, but I think this is something that could be added in the near future.

erikhh commented 3 years ago

Hi @devshawn, I know right?! Same here, sorry for the late response.

For now I just had to get 'my thing' here going, so I went ahead and added a bit of Mustache to at least get my principal-based repetition out of the way. It ain't exactly pretty, but it's functional. I made our CI pipeline take a Mustache template of the state.yaml and apply the correct variables for the environment it's building for, producing a full-fledged state.yaml file that's published to an S3 bucket to be picked up.
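
A CI step of that kind could look roughly like the sketch below. It is not the actual pipeline: it uses plain string substitution of the `{{.}}` tag as a stand-in for a real Mustache engine, and the tenant names and the `render_for_tenants` helper are made up for illustration. The rendered fragment would still have to be placed under the right section of the state file.

```python
# Stand-in for a Mustache render step: plain string substitution of the
# implicit-iterator tag {{.}}, not a real Mustache engine.
TEMPLATE = """\
{{.}}_journal-extractor:
  write-tenant-journal:
    name: journal-{{.}}
    type: TOPIC
    pattern: LITERAL
    host: '*'
    principal: User:{{.}}_journal-extractor
    operation: WRITE
    permission: ALLOW
"""


def render_for_tenants(template: str, tenants: list[str]) -> str:
    """Repeat the template once per tenant, replacing every {{.}} tag."""
    return "".join(template.replace("{{.}}", tenant) for tenant in tenants)


if __name__ == "__main__":
    # Hypothetical tenant names; a CI job would supply these per environment.
    print(render_for_tenants(TEMPLATE, ["acme", "globex"]))
```

Adding a tenant then means adding one name to the list rather than copying a block of ACLs.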

Based on my experience with kafka-gitops so far, it would already be really good (for me anyway) to DRY up the way you can add and assign custom service ACLs. There's a lot of repetition in using those. For example, this is one of my entries now:

  {{.}}_journal-extractor:
    create-tenant-journal:
      name: journal-{{.}}
      type: TOPIC
      pattern: LITERAL
      host: '*'
      principal: User:{{.}}_journal-extractor
      operation: CREATE
      permission: ALLOW
    describe-tenant-journal:
      name: journal-{{.}}
      type: TOPIC
      pattern: LITERAL
      host: '*'
      principal: User:{{.}}_journal-extractor
      operation: DESCRIBE
      permission: ALLOW
    write-tenant-journal:
      name: journal-{{.}}
      type: TOPIC
      pattern: LITERAL
      host: '*'
      principal: User:{{.}}_journal-extractor
      operation: WRITE
      permission: ALLOW
    create-ship-journal:
      name: journal-{{.}}-ship-incoming-
      type: TOPIC
      pattern: PREFIXED
      host: '*'
      principal: User:{{.}}_journal-extractor
      operation: CREATE
      permission: ALLOW
    describe-ship-journal:
      name: journal-{{.}}-ship-incoming-
      type: TOPIC
      pattern: PREFIXED
      host: '*'
      principal: User:{{.}}_journal-extractor
      operation: DESCRIBE
      permission: ALLOW
    write-ship-journal:
      name: journal-{{.}}-ship-incoming-
      type: TOPIC
      pattern: PREFIXED
      host: '*'
      principal: User:{{.}}_journal-extractor
      operation: WRITE
      permission: ALLOW

First of all, for some reason I had to repeat the principal over and over again, even though it's also specified with the service already. But this might also just be a bug.

It would already save a lot of repetition if you could pass more than one operation, maybe by accepting a list there instead of a single value.

That could already reduce the above to something like:

{{.}}_journal-extractor:
  tenant-journal:
    name: journal-{{.}}
    type: TOPIC
    pattern: LITERAL
    host: '*'
    principal: User:{{.}}_journal-extractor
    operation: CREATE, DESCRIBE, WRITE
    permission: ALLOW
  ship-journal:
    name: journal-{{.}}-ship-incoming-
    type: TOPIC
    pattern: PREFIXED
    host: '*'
    principal: User:{{.}}_journal-extractor
    operation: CREATE, DESCRIBE, WRITE
    permission: ALLOW

And something similar would also clean up the custom user ACLs by pretty much the same amount.
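
Until something like that exists in the tool, the compact form could be expanded by a small preprocessing step before the state file is handed to kafka-gitops. The sketch below is only an illustration of that idea, not a kafka-gitops feature: `expand_operations` is a hypothetical helper, the comma-separated `operation` value follows the proposal above, and the generated key names are an arbitrary choice.

```python
def expand_operations(acls: dict[str, dict]) -> dict[str, dict]:
    """Split each ACL whose 'operation' lists several values into one ACL per value."""
    expanded = {}
    for name, acl in acls.items():
        operations = [op.strip() for op in acl["operation"].split(",")]
        for op in operations:
            # Key naming (lower-cased operation as prefix) is an arbitrary choice.
            expanded[f"{op.lower()}-{name}"] = {**acl, "operation": op}
    return expanded


if __name__ == "__main__":
    compact = {
        "tenant-journal": {
            "name": "journal-acme",  # hypothetical tenant name
            "type": "TOPIC",
            "pattern": "LITERAL",
            "host": "*",
            "principal": "User:acme_journal-extractor",
            "operation": "CREATE, DESCRIBE, WRITE",
            "permission": "ALLOW",
        }
    }
    for key, acl in expand_operations(compact).items():
        print(key, acl["operation"])
```

Each expanded entry carries a single operation, which is the shape the long-form entries use today.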

wolever commented 3 years ago

For what it's worth, I've been using jsonnet and been happy with it (… at least, as happy as one can be when they need to template JSON).

They even have a tool to convert YAML -> JSON to make the import easier: https://jsonnet.org/articles/kubernetes.html

Here's a bit of the config I use:

local envswitch(vals) = vals[std.extVar('env')];

local config = {
  env: std.extVar('env'),
  admin_user: envswitch({
    'local': '',
    dev: 'User:160482',
  }),
  rw_user: envswitch({
    'local': '',
    dev: 'User:160484',
  }),
  ro_user: envswitch({
    'local': '',
    dev: 'User:160485',
  }),
  partitions: 3,
  replicas: 3,
};

local ACL(base) = {
  pattern: 'LITERAL',
  host: '*',
  permission: 'ALLOW',
} + base;

local ifNotLocal(obj) = if config.env != 'local' then obj else {};

[{
  // Topic config reference:
  //   https://docs.confluent.io/platform/current/installation/configuration/topic-configs.html
  topics: {
    'example': {
      partitions: config.partitions,
      replication: config.replicas,
      configs: {
        'cleanup.policy': 'delete',
        'retention.ms': 15 * 24 * 60 * 60 * 1000,
      },
    },
  },

  users: ifNotLocal({
    'admin-user': {
      principal: config.admin_user,
    },
    'read-write-user': {
      principal: config.rw_user,
    },
    'read-only-user': {
      principal: config.ro_user,
    },
  }),

  customUserAcls: ifNotLocal({
    'admin-user': (
      {
        'cluster-alter': ACL({
          type: 'CLUSTER',
          name: 'kafka-cluster',
          operation: 'ALTER',
        }),
      } +
      {
        ['topic-' + op]: ACL({
          type: 'TOPIC',
          name: '*',
          operation: op,
        })
        for op in ['CREATE', 'ALTER', 'DELETE', 'DESCRIBE', 'DESCRIBE_CONFIGS', 'ALTER_CONFIGS', 'READ', 'WRITE']
      }
    ),
    'read-write-user': (
      {
        ['topic-' + op]: ACL({
          type: 'TOPIC',
          name: '*',
          operation: op,
        })
        for op in ['DESCRIBE', 'DESCRIBE_CONFIGS', 'READ', 'WRITE']
      } +
      {
        ['group-' + group + '-' + op]: ACL({
          type: 'GROUP',
          name: group + '.',
          pattern: 'PREFIXED',
          operation: op,
        })
        for op in ['READ', 'WRITE', 'CREATE']
        for group in ['ephemeral', 'app']
      }
    ),
    'read-only-user': (
      {
        ['topic-' + op]: ACL({
          type: 'TOPIC',
          name: '*',
          operation: op,
        })
        for op in ['DESCRIBE', 'DESCRIBE_CONFIGS', 'READ']
      } +
      {
        ['group-' + group + '-' + op]: ACL({
          type: 'GROUP',
          name: group + '.',
          pattern: 'PREFIXED',
          operation: op,
        })
        for op in ['READ', 'WRITE', 'CREATE']
        for group in ['ephemeral', 'ro']
      }
    ),
  }),
}]
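
For readers who don't know Jsonnet, here is a rough Python equivalent of the `ACL()` helper and the object comprehensions used for `read-write-user` above. It is only meant to show which entries the comprehensions generate; the function names are illustrative and nothing here is part of kafka-gitops or Jsonnet.

```python
def acl(base: dict) -> dict:
    """Mirror of the Jsonnet ACL() helper: defaults first, overrides from base on top."""
    return {"pattern": "LITERAL", "host": "*", "permission": "ALLOW", **base}


def read_write_user_acls() -> dict:
    """Build the same set of entries as the 'read-write-user' block."""
    topic_acls = {
        f"topic-{op}": acl({"type": "TOPIC", "name": "*", "operation": op})
        for op in ["DESCRIBE", "DESCRIBE_CONFIGS", "READ", "WRITE"]
    }
    group_acls = {
        f"group-{group}-{op}": acl({
            "type": "GROUP",
            "name": group + ".",
            "pattern": "PREFIXED",  # overrides the LITERAL default
            "operation": op,
        })
        for op in ["READ", "WRITE", "CREATE"]
        for group in ["ephemeral", "app"]
    }
    return {**topic_acls, **group_acls}


if __name__ == "__main__":
    for key, entry in read_write_user_acls().items():
        print(key, entry["pattern"], entry["operation"])
```

The nested loop yields one entry per combination of operation and group, so a short list of operations and groups stands in for many hand-written ACL blocks.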

winterelf commented 2 years ago

Hi, we are also evaluating this tool. We came up with a few enhancements that would make this project a really good fit as a DevOps team solution:

  1. Export the state of an existing cluster: most companies will start using the tool with an existing Kafka cluster that already has ACLs and topics, and it would be crazy to map the whole cluster into a state file manually.
  2. Templating for the whole project, so that each dev team can update their own state values (not the state file), which are later injected into the template state file.
  3. Defaults: enable default values in ACLs, topics, etc., so we wouldn't need to repeat lots of code.

Hope it helps :)