RedPillAnalytics / gradle-confluent


Support Variable Replacements #130

Open · mkrain opened this issue 10 months ago

Is your feature request related to a problem? Please describe.

Certain elements of a SQL file need to be dynamic (e.g., partition count, replicas, format) to support different deployment environments. However, the current plugin does not support variable substitution. ksqlDB supports this natively, see [Substitute variables](https://docs.ksqldb.io/en/latest/how-to-guides/substitute-variables/#use-a-variable), but it requires access to the ksqlDB API and/or CLI.

Describe the solution you'd like

One thought is to allow a mapping file to be specified, which the plugin would use to DEFINE and UNDEFINE the mappings during SQL creation/deployment. For example:

pipelineExecute --mapping-file environment.json

This would need to be supported both from the Maven package and via local execution. The result would be baked into the KSQL:

DEFINE ${key_0}='${value_0}';
...
UNDEFINE ${key_0};
...

This template, applied to the mapping file shown further below, would produce:

DEFINE streamName='prod_user_mapping_stream';
DEFINE colName1='id';
DEFINE colName2='name';
DEFINE topicName='prod_user_mapping';
DEFINE format='JSON';
DEFINE replicas='3';

CREATE STREAM ${streamName} (
  ${colName1} INT,
  ${colName2} STRING
) WITH (
  kafka_topic = '${topicName}',
  format = '${format}',
  replicas = ${replicas},
  ...
);

UNDEFINE streamName;
UNDEFINE colName1;
UNDEFINE colName2;
UNDEFINE topicName;
UNDEFINE format;
UNDEFINE replicas;

The mapping file, JSON in this example, is a set of key-value pairs:

{
  "streamName": "prod_user_mapping_stream",
  "colName1": "id",
  "colName2": "name",
  "topicName": "prod_user_mapping",
  "format": "JSON",
  "replicas": 3
}
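
To make the intended transformation concrete, here is a minimal sketch of the wrapping step in Kotlin. Everything in it is hypothetical rather than existing gradle-confluent code: wrapWithDefines is an invented helper, Jackson is assumed to be on the classpath, and the mapping file is assumed to be flat JSON as above.

import com.fasterxml.jackson.databind.ObjectMapper
import java.io.File

// Hypothetical helper: wraps a KSQL script in DEFINE/UNDEFINE statements
// generated from a flat JSON mapping file. Not part of gradle-confluent today.
fun wrapWithDefines(sqlFile: File, mappingFile: File): String {
    @Suppress("UNCHECKED_CAST")
    val mappings = ObjectMapper().readValue(mappingFile, Map::class.java) as Map<String, Any?>
    // One DEFINE per key, emitted before the script body
    val defines = mappings.entries.joinToString("\n") { (key, value) ->
        "DEFINE $key='$value';"
    }
    // Matching UNDEFINEs after the script body, so variables don't leak
    val undefines = mappings.keys.joinToString("\n") { key -> "UNDEFINE $key;" }
    return listOf(defines, sqlFile.readText().trim(), undefines).joinToString("\n\n")
}

Note that every value is emitted as a quoted literal, which matches ksqlDB's DEFINE syntax; unquoted usages such as replicas = ${replicas} still work because the substitution is textual.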

Describe alternatives you've considered

When running locally, variable substitution as described in the link above works as expected. However, our use case is to bundle all of the SQL files and store them in a repository. The bundle is then pulled down in a separate step and deployed to multiple environments using the pipelineExecute command via the Maven functionality. There is no opportunity to influence the contents of the SQL files once they have been zipped, which means they contain hard-coded values that are the same regardless of the environment deployed to.
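
As a stopgap under the current plugin, the same wrapping could be applied as a deploy-time preprocessing step over the extracted bundle before pipelineExecute runs. Again a sketch only, reusing the hypothetical wrapWithDefines helper above, with an illustrative directory layout:

import java.io.File

// Sketch of a pre-deploy step: wrap every extracted .sql script with the
// environment-specific DEFINE/UNDEFINE statements before deployment.
// wrapWithDefines is the hypothetical helper sketched earlier; all paths
// here are illustrative.
fun preprocessPipeline(pipelineDir: File, mappingFile: File) {
    pipelineDir.walkTopDown()
        .filter { it.isFile && it.extension == "sql" }
        .forEach { script -> script.writeText(wrapWithDefines(script, mappingFile)) }
}

fun main() {
    preprocessPipeline(File("build/pipeline"), File("mappings/prod.json"))
}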