TheDataStation / ver

Data Discovery Tools and Systems
MIT License

Unified Profile Schema #46

Closed: snowgy closed this issue 1 year ago

snowgy commented 1 year ago

ddprofiler produces a set of profiles for each dataset. The schema (that is, the fields) of a profile should be clear to users. In the future, we may also let users edit the profile schema.

In the current version of Ver, the dindex_builder module relies on a hard-coded profile schema file (located at dindex_builder/profile_index_schema_duckdb.txt). Our objective is to establish a single, unified profile schema that both ddprofiler and dindex_builder use and that users can easily read.

Once a unified profile schema is established, adding a new field to ddprofiler will no longer require modifying the dindex_builder code. Moreover, a user who wants to know which fields ddprofiler produces can simply consult the schema file.
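
As a rough illustration of the idea, here is a minimal sketch of one shared schema definition driving both the index table definition and the user-facing field listing. Everything in it is an assumption made for illustration: the JSON layout, the table name `profile`, the field names and types, and the helper functions `load_schema`, `create_table_sql`, and `describe_fields` are hypothetical and are not taken from the Ver codebase.

```python
import json

# Hypothetical unified schema. The layout, table name, and fields are
# illustrative only; they are not the actual fields produced by ddprofiler.
SCHEMA_JSON = """
{
  "table": "profile",
  "fields": [
    {"name": "id", "type": "BIGINT", "description": "Column identifier"},
    {"name": "column_name", "type": "VARCHAR", "description": "Name of the profiled column"},
    {"name": "total_values", "type": "INTEGER", "description": "Number of values seen"}
  ]
}
"""


def load_schema(text):
    """Parse the schema text and reject duplicate field names."""
    schema = json.loads(text)
    names = [field["name"] for field in schema["fields"]]
    if len(names) != len(set(names)):
        raise ValueError("duplicate field names in profile schema")
    return schema


def create_table_sql(schema):
    """Build a CREATE TABLE statement from the schema (index-builder side)."""
    columns = ", ".join(f'{f["name"]} {f["type"]}' for f in schema["fields"])
    return f'CREATE TABLE {schema["table"]} ({columns});'


def describe_fields(schema):
    """Return one human-readable line per field (user-facing side)."""
    return [
        f'{f["name"]} ({f["type"]}): {f["description"]}'
        for f in schema["fields"]
    ]


schema = load_schema(SCHEMA_JSON)
print(create_table_sql(schema))
for line in describe_fields(schema):
    print(line)
```

With a layout like this, adding a field would mean adding one entry to the schema file; the table definition and the field listing would both pick it up without further code changes.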