open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

Ingestion: Kafka protobuf schema parsing only works if payload type is named the same as the topic #15274

Open MattMasuda opened 7 months ago

MattMasuda commented 7 months ago

Affected module: Ingestion

Describe the bug

When ingesting from Kafka, Protobuf schema parsing fails unless the top-level message type is named after the topic. For example, for a topic named address_book the message must be named AddressBook. Otherwise, parsing fails with warnings like these:

[2024-02-20T16:10:40.346+0000] {common_broker_source.py:104} INFO - Fetching topic config loans
[2024-02-20T16:10:40.497+0000] {protobuf_parser.py:165} WARNING - Unable to create protobuf python module for loans: module 'loans_pb2' has no attribute 'Loans'
[2024-02-20T16:10:40.499+0000] {protobuf_parser.py:200} WARNING - Unable to parse protobuf schema for loans: 'NoneType' object has no attribute 'DESCRIPTOR'
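The first warning suggests that the parser derives a class name from the topic name and looks it up on the generated module. The following is a minimal Python sketch of that failure mode, not the OpenMetadata code. The helper `topic_to_class_name`, the function `find_message_class`, and the stand-in `loans_pb2` module with a message named `LoanApplication` are all hypothetical, chosen only to mirror the log output.

```python
from types import SimpleNamespace


def topic_to_class_name(topic: str) -> str:
    # Hypothetical naming rule: snake_case topic -> CamelCase message name,
    # e.g. "address_book" -> "AddressBook".
    return "".join(part.capitalize() for part in topic.split("_"))


# Stand-in for a generated loans_pb2 module whose top-level message is
# named LoanApplication rather than Loans (assumed for illustration).
loans_pb2 = SimpleNamespace(LoanApplication=object)


def find_message_class(module, topic: str):
    # Look the class up by the topic-derived name; None if it is absent.
    return getattr(module, topic_to_class_name(topic), None)


# The topic is "loans", so the lookup asks for "Loans" and finds nothing.
missing = find_message_class(loans_pb2, "loans")
```

If the lookup yields `None` like this, a later access to `.DESCRIPTOR` on the result would raise the error shown in the second warning.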

In the OpenMetadata UI, only the text version of the schema appears. No fields are displayed, and there is no way to add tags or glossary terms to the schema (screenshot in the original issue).

To Reproduce

  1. In Kafka, create a topic.
  2. Using the schema registry, create a Protobuf schema for the topic, but give the top-level message a name other than the topic name.
  3. Run an ingestion against the Kafka cluster.
  4. In the UI the topic will only have the Protobuf schema in text format.

Expected behavior

The full schema with fields, tags, and glossary terms should be displayed for the topic.
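One possible direction, sketched here under assumptions rather than taken from the project, is to resolve the message type from the generated module's file descriptor instead of from the topic name. Generated `*_pb2` modules expose `DESCRIPTOR.message_types_by_name`; the function `resolve_message_class`, the stand-in module `fake_pb2`, and the rule of falling back to the first listed message are hypothetical.

```python
from types import SimpleNamespace


def resolve_message_class(module, preferred_name=None):
    # Names of the top-level messages declared in the .proto file.
    names = list(module.DESCRIPTOR.message_types_by_name)
    if not names:
        return None
    # Use the topic-derived name when it exists, else the first message
    # (an assumed fallback; multi-message schemas may need another rule).
    name = preferred_name if preferred_name in names else names[0]
    return getattr(module, name, None)


class LoanApplication:
    """Stand-in for a generated message class (assumed for illustration)."""


# Stand-in for a generated module: a descriptor plus the message class.
fake_pb2 = SimpleNamespace(
    DESCRIPTOR=SimpleNamespace(message_types_by_name={"LoanApplication": None}),
    LoanApplication=LoanApplication,
)

# "Loans" is not declared, so the lookup falls back to LoanApplication.
resolved = resolve_message_class(fake_pb2, preferred_name="Loans")
```

With a lookup like this, a topic named loans with a message named LoanApplication would still resolve to a message class, so its fields could be parsed.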

Version:

Additional context: N/A

harshach commented 1 month ago

@ulixius9 @OnkarVO7 we might run into this issue soon. Let's target 1.6.0.