odpi / egeria

Egeria core
https://egeria-project.org
Apache License 2.0

[BUG] KafkaOpenMetadataTopicConnector requires both producer and consumer config #7409

Open juergenhemelt opened 1 year ago

juergenhemelt commented 1 year ago

Existing/related issue?

https://github.com/odpi/egeria-connector-integration-lineage-event-driven-sample/issues/42

Current Behavior

Whenever a connector is configured and a configuration for the producer and/or consumer is not passed, default values will be used as described here: https://egeria-project.org/connectors/resource/kafka-open-metadata-topic-connector/?h=kafka+op#default-properties-for-the-producer-and-consumer

If the connector is only used for reading and not for writing, a producer is unnecessary. But unless you configure both the consumer and the producer, the connector will not start. It checks the availability of the Kafka brokers for both the consumer and the producer, and if the producer is not configured it takes localhost:9092 as the default. Unless you have a local Kafka cluster, the startup will fail.

Expected Behavior

There should be a configuration to tell the connector if it is for reading, writing or both. Depending on this configuration the availability of the Kafka broker(s) should only be checked for the consumer and/or producer config.

Steps To Reproduce

I used the SampleLineageEventReceiverIntegrationConnector and configured only a consumer and not a producer for the embedded OpenMetadataTopicConnector.

Environment

- Egeria: 3.15
- OS: Linux
- Java: 11.x
- Browser (for UI issues):
- Additional connectors and integration:

Any Further Information?

Workaround is to configure both a consumer and a producer with the same Kafka properties.
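A minimal sketch of the workaround, assuming the connector's configuration properties hold the producer and consumer settings under the keys "producer" and "consumer" (an assumption based on the property names in the connector documentation linked above). The class and method names are hypothetical and only illustrate the shape of the configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Egeria: builds configuration properties
// that give the producer and the consumer the same Kafka broker list.
final class WorkaroundConfigSketch {

    static Map<String, Object> configurationProperties(String bootstrapServers) {
        // "bootstrap.servers" is the standard Kafka client property for the broker list.
        Map<String, Object> shared = new HashMap<>();
        shared.put("bootstrap.servers", bootstrapServers);

        // Assumed key names "producer" and "consumer"; each side gets its own copy.
        Map<String, Object> config = new HashMap<>();
        config.put("producer", new HashMap<>(shared));
        config.put("consumer", new HashMap<>(shared));
        return config;
    }

    public static void main(String[] args) {
        System.out.println(configurationProperties("kafka1:9092,kafka2:9092"));
    }
}
```

With both entries present, the broker check no longer falls back to localhost:9092 for the producer.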

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

mandy-chessell commented 1 year ago

I think this one should be pursued ...

mandy-chessell commented 1 year ago

The aim stated above is to minimize the administration effort. In addition, there is potentially a runtime performance improvement if we can avoid starting a producer or consumer unnecessarily.

Implementing this fix needs some thought on when the information is supplied that determines whether the Kafka topic connector should start either the producer or the consumer or both. The Kafka topic connector is just one implementation of the open metadata topic connector. Changes to the main runtime code need to work for other event bus technology too.

In general, we would expect configuration for topics to occur:

Most of the helper methods that configure topics in the configuration document use the event bus config to provide default values for the open metadata topic connector in use. Since this is used in multiple places (and not at runtime), it needs information for both the producer and the consumer.

The code that creates the open metadata topic connection always knows whether it is sending events, receiving events, or both. Therefore it should be possible to pass an architected configuration property, defined as part of the Open Metadata Topic Connector interface, that each implementation can interpret as appropriate. This value would be set by the consuming code that knows which direction events are flowing. If the value is not set, it is assumed that events are flowing in both directions.
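The defaulting rule described here can be sketched as follows. The property name and values are the ones proposed later in this thread; the class, the method names, and the choice to treat an unrecognized value like an unset one are assumptions made for illustration, not the Egeria implementation.

```java
import java.util.Map;

// Hypothetical helper, not part of Egeria: resolves the event direction from
// a connection's configuration properties.
final class EventDirectionSketch {
    // Property name and values as proposed in this thread.
    static final String EVENT_DIRECTION_PROPERTY_NAME = "eventDirection";
    static final String EVENT_DIRECTION_INOUT         = "inOut";
    static final String EVENT_DIRECTION_OUT_ONLY      = "outOnly";
    static final String EVENT_DIRECTION_IN_ONLY       = "inOnly";

    // Returns the configured direction, or inOut when the property is not set.
    static String resolve(Map<String, Object> configurationProperties) {
        if (configurationProperties == null) {
            return EVENT_DIRECTION_INOUT;
        }
        Object value = configurationProperties.get(EVENT_DIRECTION_PROPERTY_NAME);
        if (EVENT_DIRECTION_IN_ONLY.equals(value) || EVENT_DIRECTION_OUT_ONLY.equals(value)) {
            return (String) value;
        }
        // Unset means both directions; unrecognized values are treated the
        // same way here (an assumption of this sketch).
        return EVENT_DIRECTION_INOUT;
    }

    // A producer is needed unless events only flow in.
    static boolean needsProducer(String direction) {
        return !EVENT_DIRECTION_IN_ONLY.equals(direction);
    }

    // A consumer is needed unless events only flow out.
    static boolean needsConsumer(String direction) {
        return !EVENT_DIRECTION_OUT_ONLY.equals(direction);
    }
}
```

Because the default is both directions, existing configurations that never set the property keep their current behavior.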

mandy-chessell commented 1 year ago

Names for the new configuration property that any implementation of the OpenMetadataTopicConnector can choose to implement.

    public static final String  EVENT_DIRECTION_PROPERTY_NAME = "eventDirection";
    public static final String  EVENT_DIRECTION_INOUT   = "inOut";
    public static final String  EVENT_DIRECTION_OUT_ONLY = "outOnly";
    public static final String  EVENT_DIRECTION_IN_ONLY  = "inOnly";

These values are found in the OpenMetadataTopicProvider.

The KafkaOpenMetadataTopicProvider extends its recognized configuration properties:

        List<String>  recognizedPropertyNames = new ArrayList<>();

        recognizedPropertyNames.add(producerPropertyName);
        recognizedPropertyNames.add(consumerPropertyName);
        recognizedPropertyNames.add(serverIdPropertyName);
        recognizedPropertyNames.add(sleepTimeProperty);
        recognizedPropertyNames.add(OpenMetadataTopicProvider.EVENT_DIRECTION_PROPERTY_NAME);

        connectorType.setRecognizedConfigurationProperties(recognizedPropertyNames);

The Kafka implementation of the OpenMetadataTopicConnector considers the event direction when checking if the brokers are up (that is, whether it looks in the producer properties, the consumer properties, or both for the broker addresses) and when starting the producer and consumer threads.
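A rough sketch of how the direction could narrow the broker check, assuming the producer and consumer properties are plain string maps carrying the standard Kafka `bootstrap.servers` property. The class and method names are hypothetical; the actual connector code may be organized differently.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Egeria: picks the broker addresses that
// need an availability check for a given event direction.
final class BrokerCheckSketch {
    static final String BOOTSTRAP_SERVERS = "bootstrap.servers";

    static Set<String> brokersToCheck(String eventDirection,
                                      Map<String, String> producerProperties,
                                      Map<String, String> consumerProperties) {
        Set<String> brokers = new LinkedHashSet<>();
        // Producer brokers matter unless events only flow in.
        if (!"inOnly".equals(eventDirection)) {
            addBrokers(brokers, producerProperties);
        }
        // Consumer brokers matter unless events only flow out.
        if (!"outOnly".equals(eventDirection)) {
            addBrokers(brokers, consumerProperties);
        }
        return brokers;
    }

    // Splits a comma-separated bootstrap.servers value into individual addresses.
    private static void addBrokers(Set<String> brokers, Map<String, String> properties) {
        String servers = properties.get(BOOTSTRAP_SERVERS);
        if (servers == null) {
            return;
        }
        for (String server : servers.split(",")) {
            String trimmed = server.trim();
            if (!trimmed.isEmpty()) {
                brokers.add(trimmed);
            }
        }
    }
}
```

With `inOnly`, a producer left at the localhost:9092 default is never consulted, so a read-only connector would no longer fail at startup for lack of a local broker.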

The configuration helper classes:

The startup logic for the access services sets up the event direction appropriately in the connection when it is about to create the connector (and knows whether it is an inTopic or outTopic and whether it is for the client or server side).

This required a small change to the multi-tenant support because Data Engine OMAS is passing its InTopic connection in the parameter for the outTopic.
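The way startup code might choose a direction from the topic kind and the side can be sketched as below. The mapping is an assumption, not taken from the Egeria code: the server side of an inTopic and the client side of an outTopic receive events, and the other two combinations send them. All class, enum, and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Egeria: derives the eventDirection value
// for a topic connection and adds it to the configuration properties.
final class TopicDirectionSketch {
    enum TopicKind { IN_TOPIC, OUT_TOPIC }
    enum Side { SERVER, CLIENT }

    // Assumed mapping: a side receives events when it is the server of an
    // inTopic or the client of an outTopic; otherwise it sends them.
    static String directionFor(TopicKind kind, Side side) {
        boolean receives = (kind == TopicKind.IN_TOPIC) == (side == Side.SERVER);
        return receives ? "inOnly" : "outOnly";
    }

    // Returns a copy of the existing properties with eventDirection set,
    // leaving the caller's map untouched.
    static Map<String, Object> withEventDirection(Map<String, Object> existing,
                                                  TopicKind kind,
                                                  Side side) {
        Map<String, Object> copy = (existing == null) ? new HashMap<>() : new HashMap<>(existing);
        copy.put("eventDirection", directionFor(kind, side));
        return copy;
    }
}
```

A caller that passes the wrong connection for a topic, as described for Data Engine OMAS, would end up with the wrong direction under this scheme, which is why that case needed correcting.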