datahub-project / datahub

The Metadata Platform for your Data and AI Stack
https://datahubproject.io
Apache License 2.0
9.93k stars 2.94k forks

feat(ingest/cassandra): Add support for Cassandra as a source #11822

Closed sagar-salvi-apptware closed 6 days ago

sagar-salvi-apptware commented 1 week ago

Checklist

jjoyce0510 commented 1 week ago

Things to confirm:

| Cassandra Data Type | DataHub Data Type |
| --- | --- |
| ascii | StringType |
| bigint | NumberType |
| blob | BytesType |
| boolean | BooleanType |
| counter | NumberType |
| date | DateType |
| decimal | NumberType |
| double | NumberType |
| float | NumberType |
| inet | StringType |
| int | NumberType |
| list | ArrayType |
| map | MapType |
| set | ArrayType |
| smallint | NumberType |
| text | StringType |
| time | TimeType |
| timestamp | DateType |
| timeuuid | StringType |
| tinyint | NumberType |
| tuple | ArrayType |
| uuid | StringType |
| varchar | StringType |
| varint | NumberType |
| frozen<map<text, text>> | MapType |
| frozen<list> | ArrayType |
| frozen<set> | ArrayType |

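
As a way to check the mapping above, here is a minimal sketch of a lookup that resolves a CQL type string to the DataHub type name listed in the table. The names `CASSANDRA_TO_DATAHUB` and `resolve_type` are hypothetical and are not taken from the PR; the values are plain strings rather than DataHub classes, and returning `None` for unknown types (such as UDT names) is an assumption.

```python
from typing import Optional

# Hypothetical lookup table mirroring the mapping in this comment.
# Values are type names as strings, not actual DataHub classes.
CASSANDRA_TO_DATAHUB = {
    "ascii": "StringType", "bigint": "NumberType", "blob": "BytesType",
    "boolean": "BooleanType", "counter": "NumberType", "date": "DateType",
    "decimal": "NumberType", "double": "NumberType", "float": "NumberType",
    "inet": "StringType", "int": "NumberType", "list": "ArrayType",
    "map": "MapType", "set": "ArrayType", "smallint": "NumberType",
    "text": "StringType", "time": "TimeType", "timestamp": "DateType",
    "timeuuid": "StringType", "tinyint": "NumberType", "tuple": "ArrayType",
    "uuid": "StringType", "varchar": "StringType", "varint": "NumberType",
}


def resolve_type(cql_type: str) -> Optional[str]:
    """Return the mapped type name for a CQL type string, or None if unknown."""
    t = cql_type.strip().lower()
    # Unwrap frozen<...> so frozen<map<text, text>> resolves like map<text, text>.
    while t.startswith("frozen<") and t.endswith(">"):
        t = t[len("frozen<"):-1].strip()
    # Drop type parameters: set<int> -> set, map<text, text> -> map.
    base = t.split("<", 1)[0].strip()
    return CASSANDRA_TO_DATAHUB.get(base)
```

With this sketch, `resolve_type("frozen<map<text, text>>")` and `resolve_type("map<text, int>")` both resolve to the `map` row, so parameterized and frozen collections do not need their own entries.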
jjoyce0510 commented 1 week ago

set<?> -> ArrayType
list<?> -> ArrayType
map<?> -> MapType

And UDT -> Flattened structure
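
One possible reading of "flattened structure" for UDTs is that each nested field becomes its own column-level entry with a dotted path. The sketch below illustrates that idea only; `flatten_udt`, the nested-dict input shape, and the `.` separator are all assumptions and do not reflect how the connector represents field paths.

```python
from typing import Any, Dict, Iterator, Tuple


def flatten_udt(
    column: str, fields: Dict[str, Any], sep: str = "."
) -> Iterator[Tuple[str, str]]:
    """Yield (field_path, cql_type) pairs for a UDT column.

    `fields` maps a field name either to a CQL type string or to a nested
    dict describing another UDT (hypothetical input shape).
    """
    for name, ftype in fields.items():
        path = f"{column}{sep}{name}"
        if isinstance(ftype, dict):
            # Nested UDT: recurse and keep extending the path.
            yield from flatten_udt(path, ftype, sep)
        else:
            yield path, ftype
```

For example, a column `address` with fields `street` (text) and a nested `geo` UDT holding `lat` and `lon` would yield `address.street`, `address.geo.lat`, and `address.geo.lon`.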

jjoyce0510 commented 1 week ago

Views & Tables should have the same aspects

And ensure we have table + column comments as description fields.

.....And

Views have

jjoyce0510 commented 1 week ago

Table + Column Comments should be QA'd locally.

jjoyce0510 commented 1 week ago

Also, please make sure error handling is clear, with try-excepts around the areas that need them. This connector should only fail in known cases. We should not have ANY uncaught exceptions!
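
A minimal sketch of the pattern being asked for: wrap each per-table step in a try-except, record the problem on a report object, and keep going instead of letting the exception escape. `IngestReport`, `extract_tables`, and the `fetch_schema` callback are hypothetical names for illustration, not the connector's or DataHub's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, Iterator, List, Tuple


@dataclass
class IngestReport:
    """Hypothetical report object that collects problems instead of raising."""
    warnings: List[str] = field(default_factory=list)
    failures: List[str] = field(default_factory=list)


def extract_tables(
    table_names: Iterable[str],
    fetch_schema: Callable[[str], Any],
    report: IngestReport,
) -> Iterator[Tuple[str, Any]]:
    """Yield (table, schema) pairs; a failing table is recorded and skipped."""
    for name in table_names:
        try:
            schema = fetch_schema(name)
        except KeyError as e:
            # Known case: the table disappeared or has no metadata.
            report.warnings.append(f"{name}: metadata not found ({e})")
            continue
        except Exception as e:
            # Unknown case: still caught, so the run never dies uncaught.
            report.failures.append(f"{name}: {type(e).__name__}: {e}")
            continue
        yield name, schema
```

The broad `except Exception` is deliberate here: it is the last line of defense per table, and everything it catches is surfaced in the report rather than swallowed.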