AbsaOSS / hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark
Apache License 2.0

Configuration property keys of components should be accessible via reflection #83

Closed by kevinwallimann 4 years ago

kevinwallimann commented 4 years ago

Currently, the configuration property keys are hard-coded in the ingestor (or in any custom component jar). As a result, the required configuration properties cannot be inferred from the jars alone. An external program (e.g. the hyperdrive-trigger) should be able to read a component's available configuration properties via reflection, given only the jars.

To that end, every ComponentFactory should implement a method that returns a list of configuration properties. This list should also state whether each property is required or optional, and include validation rules. Some components depend on each other, e.g. the CheckpointOffsetManager requires the KafkaStreamReader to be configured (or at least requires the property reader.kafka.topic to be defined). Such dependencies should be taken into account as well.

Possible breakdown of this issue:

  1. List the config properties with required/optional info.
  2. Add validation rules to the list.
  3. Provide a way, e.g. another method, to define dependencies on other components (multiple components, either component A or component B, ...).
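
The three steps above could be sketched roughly as follows. This is only an illustration of the idea, not the implementation that ended up in hyperdrive: PropertyDescriptor, HasConfigurationProperties, requiredForeignKeys, ConfigCheck, the two component objects and the key example.checkpoint.location are all hypothetical names. Only reader.kafka.topic comes from the description above. The dependency check covers just the simplest case (a key owned by another component must be present), not the "either A or B" case.

```scala
object ConfigPropertiesSketch {
  // Step 1 and 2: a property key, whether it is required, and a validation rule.
  final case class PropertyDescriptor(
    key: String,
    required: Boolean,
    validate: String => Boolean = _ => true // accepts any value by default
  )

  trait HasConfigurationProperties {
    def properties: Seq[PropertyDescriptor]
    // Step 3 (simplest form): keys owned by other components that must be defined.
    def requiredForeignKeys: Seq[String] = Seq.empty
  }

  // Hypothetical stand-in for a reader component.
  object KafkaStreamReaderSketch extends HasConfigurationProperties {
    override def properties: Seq[PropertyDescriptor] = Seq(
      PropertyDescriptor("reader.kafka.topic", required = true, validate = _.nonEmpty),
      PropertyDescriptor("example.optional.key", required = false) // hypothetical key
    )
  }

  // Hypothetical stand-in for an offset manager that depends on the reader's topic.
  object CheckpointOffsetManagerSketch extends HasConfigurationProperties {
    override def properties: Seq[PropertyDescriptor] = Seq(
      PropertyDescriptor("example.checkpoint.location", required = true) // hypothetical key
    )
    override def requiredForeignKeys: Seq[String] = Seq("reader.kafka.topic")
  }

  object ConfigCheck {
    // Returns one message per missing required key, invalid value, or unmet dependency.
    def problems(
      components: Seq[HasConfigurationProperties],
      config: Map[String, String]
    ): Seq[String] = {
      val own = for {
        c <- components
        p <- c.properties
        problem <- (config.get(p.key) match {
          case None if p.required        => Some(s"missing: ${p.key}")
          case Some(v) if !p.validate(v) => Some(s"invalid: ${p.key}")
          case _                         => None
        }).toSeq
      } yield problem

      val foreign = for {
        c <- components
        k <- c.requiredForeignKeys if !config.contains(k)
      } yield s"missing dependency: $k"

      own ++ foreign
    }
  }
}
```

An external tool could call ConfigCheck.problems with the components it discovered and the user's configuration, and treat an empty result as a valid setup.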

Migration from older versions:

Classes that implemented StreamReaderFactory, OffsetManagerFactory, StreamDecoderFactory, StreamTransformerFactory or StreamWriterFactory must now implement the trait HasComponentAttributes.
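
As a rough sketch of what this migration and the reflective lookup might look like: the trait below is a local stand-in that only shares its name with the trait mentioned above. Its single member getProperties and its Map[String, Boolean] shape (key to "is required") are assumptions, and the real trait's members may differ. ExampleStreamReaderFactory and ComponentInspector are hypothetical as well. The lookup relies on the fact that a Scala object compiles to a class with a static MODULE$ field holding the singleton instance.

```scala
object MigrationSketch {
  // Local stand-in for the trait named in the migration note (assumed member shape).
  trait HasComponentAttributes {
    def getProperties: Map[String, Boolean] // property key -> required?
  }

  // Hypothetical factory object after migration: it now also exposes its properties.
  object ExampleStreamReaderFactory extends HasComponentAttributes {
    override def getProperties: Map[String, Boolean] =
      Map("reader.kafka.topic" -> true)
  }

  object ComponentInspector {
    // Loads a Scala object by its JVM class name (ending in "$") and reads its properties.
    def propertiesOf(objectClassName: String, loader: ClassLoader): Map[String, Boolean] = {
      val clazz = Class.forName(objectClassName, true, loader)
      // A Scala object's singleton instance is stored in the static field MODULE$.
      clazz.getField("MODULE$").get(null) match {
        case component: HasComponentAttributes => component.getProperties
        case _ =>
          throw new IllegalArgumentException(
            s"$objectClassName does not implement HasComponentAttributes")
      }
    }
  }
}
```

An external program such as the hyperdrive-trigger would presumably pass a class loader built over the component jars (for instance a java.net.URLClassLoader) instead of the application's own loader.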