Why? :open_book:
We've been using the hive-metastore listener to track every DDL statement that runs on the server. For each one, the listener generates a payload and sends it to a Kafka topic so that Apache Atlas can read it and create the corresponding objects in the ecosystem.
While using the lib, we hit a problem when creating tables based on a JSON schema. The table itself was created as expected, but the listener could not get the JSON schema and failed with the following error:
2021-08-06T18:26:43,438 ERROR [pool-8-thread-2] metadata.Table: Unable to get field from serde: org.apache.hive.hcatalog.data.JsonSerDe
java.lang.NullPointerException: null
At first glance, the error looked like something in the listener code, but after running plenty of tests with many kinds of changes to the listener, I still could not find the cause of this specific error: a mere NullPointerException :disappointed:
After that, I retrieved the object created by the listener and compared it with the object saved in the hive-metastore database through the hive-metastore client. The two objects differed for the same table, which showed that some default values in TableBuilder and SerdeInfoBuilder were causing the error above.
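The comparison described above can be sketched as a field-by-field diff of two table representations. This is only an illustration: the helper `diff_fields`, the dict layout, and all field values below are invented for the example and are not taken from the listener or the client.

```python
from typing import Any, Dict, Tuple

# Sentinel used when a field exists on only one side of the comparison.
_MISSING = object()


def diff_fields(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (left_value, right_value)} for every field that differs."""
    differences: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(left) | set(right)):
        left_value = left.get(key, _MISSING)
        right_value = right.get(key, _MISSING)
        if left_value != right_value:
            differences[key] = (left_value, right_value)
    return differences


# Invented values: one object carries None where the other carries empty containers.
table_with_none_defaults = {"tableName": "events", "parameters": None, "partitionKeys": None}
table_with_empty_defaults = {"tableName": "events", "parameters": {}, "partitionKeys": []}

for field, (first, second) in diff_fields(table_with_none_defaults, table_with_empty_defaults).items():
    print(f"{field}: {first!r} != {second!r}")
```

A diff like this makes a `None` versus empty-container mismatch visible even when both objects otherwise describe the same table.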
What? :wrench:
SerdeInfoBuilder
If the parameters attribute is None, it is set to an empty dict.
TableBuilder
If the parameters attribute is None, it is set to an empty dict.
If the partition_keys attribute is None, it is set to an empty list.
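The defaulting described above could look roughly like the sketch below. It is a minimal illustration, not the library's actual code: the classes `SerdeInfoSketch` and `TableSketch` and the helper `_or_empty` are hypothetical stand-ins for the real builders, and their constructor signatures are assumptions. Only the `parameters` and `partition_keys` attribute names come from this PR.

```python
from typing import Callable, Dict, List, Optional, TypeVar

T = TypeVar("T")


def _or_empty(value: Optional[T], factory: Callable[[], T]) -> T:
    """Return value unless it is None; otherwise build a fresh empty container."""
    return factory() if value is None else value


class SerdeInfoSketch:
    """Hypothetical stand-in for SerdeInfoBuilder (signature is an assumption)."""

    def __init__(self, name: str, parameters: Optional[Dict[str, str]] = None) -> None:
        self.name = name
        # None is replaced by an empty dict so no null value is sent.
        self.parameters = _or_empty(parameters, dict)


class TableSketch:
    """Hypothetical stand-in for TableBuilder (signature is an assumption)."""

    def __init__(
        self,
        table_name: str,
        parameters: Optional[Dict[str, str]] = None,
        partition_keys: Optional[List[str]] = None,
    ) -> None:
        self.table_name = table_name
        # None is replaced by an empty dict / empty list respectively.
        self.parameters = _or_empty(parameters, dict)
        self.partition_keys = _or_empty(partition_keys, list)
```

Keeping `None` as the argument default and building the empty container inside the constructor means each instance gets its own dict or list, rather than sharing one mutable default across instances.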
Type of change :file_cabinet:
[x] New feature (non-breaking change which adds functionality)
How was everything tested? :straight_ruler:
Ran the create table event with those attributes hard-coded and checked in the hive-metastore log that the listener got the JSON schema.
Checklist :memo:
[x] I have added labels to distinguish the type of pull request.
[x] My code follows the style guidelines of this project (docstrings, type hinting and linter compliance);
[x] I have performed a self-review of my own code;
[ ] I have made corresponding changes to the documentation;
[ ] I have added tests that prove my fix is effective or that my feature works;
[x] I have made sure that new and existing unit tests pass locally with my changes;