datastrato / gravitino

World's most powerful data catalog service, providing a high-performance, geo-distributed and federated metadata lake.
https://datastrato.ai/docs/
Apache License 2.0
401 stars 166 forks

[Improvement] Add universal type support for Gravitino #2373

Open jerryshao opened 3 months ago

jerryshao commented 3 months ago

What would you like to be improved?

As the number of supported catalogs grows, Gravitino faces the problem that its current type system cannot cover all scenarios, such as unsigned types and UDTs. At the same time, we need to support DDL, so the type system should cover all of these types.

How should we improve?

To handle this problem, we should:

  1. Support UDTs, so that developers can define their own types on top of our framework to cover customized scenarios.
  2. Support type variations, so that types can be mapped to the target system.

We can refer to Substrait's type system (https://substrait.io/types/type_system/) when designing our own.
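To make the two points above concrete, here is a minimal sketch of how a base type, a type variation, and a UDT could be modeled and mapped to a target system. It is only an illustration and not Gravitino's actual design: the names `BaseType`, `TypeVariation`, `UserDefinedType`, and `TypeMapper` are hypothetical, and the MySQL mappings (including the `geo.point` UDT) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class BaseType:
    """A built-in universal type, e.g. a 32-bit signed integer (hypothetical)."""
    name: str


@dataclass(frozen=True)
class TypeVariation:
    """A named variant of a base type, e.g. the unsigned form of i32."""
    base: BaseType
    variation: str


@dataclass(frozen=True)
class UserDefinedType:
    """A developer-defined type, identified by namespace and name."""
    namespace: str
    name: str


UniversalType = Union[BaseType, TypeVariation, UserDefinedType]


class TypeMapper:
    """Maps universal types to the native type names of one target catalog."""

    def __init__(self, catalog: str) -> None:
        self.catalog = catalog
        self._to_native: Dict[UniversalType, str] = {}

    def register(self, universal_type: UniversalType, native_name: str) -> None:
        self._to_native[universal_type] = native_name

    def to_native(self, universal_type: UniversalType) -> str:
        # Fail loudly instead of silently picking a type with other semantics.
        if universal_type not in self._to_native:
            raise ValueError(
                f"{universal_type} is not supported by catalog {self.catalog}"
            )
        return self._to_native[universal_type]


# Example mappings for a MySQL-like catalog (assumed for illustration).
I32 = BaseType("i32")
U32 = TypeVariation(I32, "unsigned")
GEO_POINT = UserDefinedType("geo", "point")

mysql = TypeMapper("mysql")
mysql.register(I32, "INT")
mysql.register(U32, "INT UNSIGNED")
mysql.register(GEO_POINT, "POINT")

print(mysql.to_native(U32))
```

Keeping the variation separate from the base type means a catalog without unsigned integers simply does not register `U32`, and the mapper rejects it rather than falling back to a signed type.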

CC @mchades @yuqi1129 @zhoukangcn

mchades commented 3 months ago

The issue discussed in #1761 also needs to be considered for the universal type system: types in Gravitino have the same names as types in the catalog, but different semantics.

jerryshao commented 3 months ago

Yeah, we need to have a complete solution to handle them all.

mchades commented 2 weeks ago

Here is the solution doc: https://docs.google.com/document/d/14GAFPzf6HZcEtFey8f6t387hh_oD_5S5ml-dvcH7xjA/edit?usp=sharing