datnguye / dbterd

Generate the ERD as a code from dbt artifacts
https://dbterd.datnguyen.de/
MIT License
198 stars 28 forks

[FEAT] Replace special symbol characters with other characters to allow them to be drawn as mermaid #81

Closed syou6162 closed 7 months ago

syou6162 commented 7 months ago

Is your feature request related to a problem? Please describe.

I am using BigQuery as my DWH. BigQuery column types can contain special characters such as < (for example, Struct<first_name string, last_name string>), and column names can contain the . character (for example, name.first_name). Unfortunately, mermaid has trouble rendering markdown that contains such special characters.

For example, something like the following does not render. Commas are also not allowed in mermaid 😭.

erDiagram
  "my_package.user" {
    Struct<first_name string, last_name string> name
    string name.first_name
    string name.last_name
  }

Describe the solution you'd like

Replace the special characters with other characters, for example:

erDiagram
  "my_package.user" {
    Struct[OMITTED] name
    string name__first_name
    string name__last_name
  }

For Struct<first_name string, last_name string> name, there are three problems: the < character, the comma, and the whitespace. I didn't have a good idea for handling all of them, so the best I could come up with was Struct[OMITTED] name.

Columns that have such problems can be handled by not displaying them in the first place, as in https://github.com/datnguye/dbterd/pull/77. However, since . may also appear in relationships, I would like to consider replacing it with __.
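The replacement described above could be sketched roughly as follows. This is only an illustration of the idea, not dbterd's actual implementation: the helper names `sanitize_column_name`, `sanitize_column_type`, and `render_entity` are hypothetical, and the sketch assumes a complex type is any type of the form `Name<...>`.

```python
import re

# Hypothetical helpers for illustration; these names are not part of dbterd's API.
# Assumption: a "complex" type looks like Name<...>, e.g. Struct<...> or Array<...>.
_COMPLEX_TYPE = re.compile(r"^\s*(\w+)\s*<.*>\s*$", re.DOTALL)


def sanitize_column_name(name: str) -> str:
    """Replace dots in a column name with double underscores."""
    return name.replace(".", "__")


def sanitize_column_type(data_type: str) -> str:
    """Collapse a complex type such as Struct<...> to Struct[OMITTED]."""
    match = _COMPLEX_TYPE.match(data_type)
    if match:
        return f"{match.group(1)}[OMITTED]"
    return data_type


def render_entity(table: str, columns: list[tuple[str, str]]) -> str:
    """Render one table as an erDiagram block from (column name, column type) pairs."""
    lines = ["erDiagram", f'  "{table}" {{']
    for col_name, col_type in columns:
        lines.append(
            f"    {sanitize_column_type(col_type)} {sanitize_column_name(col_name)}"
        )
    lines.append("  }")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        render_entity(
            "my_package.user",
            [
                ("name", "Struct<first_name string, last_name string>"),
                ("name.first_name", "string"),
                ("name.last_name", "string"),
            ],
        )
    )
```

With the example columns from this issue, the sketch prints the erDiagram shown in the proposed solution. Relationship definitions that reference dotted column names would need the same `__` replacement so that both sides stay consistent.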

Describe alternatives you've considered

Wait for mermaid to be modified so that it can draw diagrams even when special characters are included...

Additional context

N/A

datnguye commented 7 months ago

@syou6162 Looks good! For Struct<first_name string, last_name string> name, what about slugifying the type string?

syou6162 commented 7 months ago

@datnguye Good idea, thanks!

If I slugify Struct<first_name string, last_name string>, it becomes, for example, struct-first_name-string--last_name-string-. That does not seem human readable, at least to me. This example is still barely readable, but Struct can be nested, and a nested type becomes very hard to read.

Fortunately, dbt has the exact column type information in yaml, and it is easy to output a link to that yaml in markdown. So I will try to deal with this by omitting complex types in the mermaid output.
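To make the readability concern concrete, here is a minimal sketch of a naive slugify. It assumes the simplest possible rule: every character other than a letter, digit, or underscore becomes a hyphen, and the result is lowercased. `naive_slug` is a hypothetical helper, not a function from any slugify library, and a real library may normalize differently (for example, by collapsing repeated hyphens).

```python
import re


def naive_slug(text: str) -> str:
    """Replace each character that is not a letter, digit, or underscore with '-'."""
    return re.sub(r"[^\w]", "-", text).lower()


flat = "Struct<first_name string, last_name string>"
# A nested type, made up here purely to illustrate the readability problem.
nested = "Struct<name Struct<first_name string, last_name string>, age int64>"

print(naive_slug(flat))  # struct-first_name-string--last_name-string-
print(naive_slug(nested))
```

In the slug of the nested type, the hyphens no longer show where the inner Struct ends, which is the readability problem described above.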

If anyone else feels this is not good enough or comes up with a better idea, I hope they will send another pull request :)