sfu-db / dbt-lineagex


Add Description Display to dbt-lineagex Lineage Graphs #3

Open KojiAndoJC opened 2 months ago

KojiAndoJC commented 2 months ago

I would like to propose an enhancement to the dbt-lineagex plugin to include dbt model descriptions within the lineage graph.

In Japan, many data professionals are more comfortable with Japanese than English. Our typical database schema design process involves:

To illustrate this, I will attach an image adapted from the help documentation of A5:SQL Mk-2, a popular SQL development and ER diagram tool in Japan. This tool displays logical names alongside physical names in ER diagrams, demonstrating the clarity this feature brings. The original help page, written in Japanese, can be found here.

[Image: ERD]

Implementing a similar feature in dbt-lineagex to show model descriptions directly in the lineage graphs would greatly benefit users like myself. While this feature might not be essential for everyone, a practical approach could be to display descriptions using HTML's title attribute for tooltips.
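To make the tooltip idea concrete, here is a minimal sketch of how a node label could carry its description in a `title` attribute. The function name `render_node_label` and the `node` CSS class are hypothetical and not taken from dbt-lineagex; the only real API used is Python's standard `html.escape`.

```python
from html import escape


def render_node_label(name: str, description: str = "") -> str:
    """Return an HTML span for a graph node (hypothetical helper).

    When a description is available it is placed in the `title`
    attribute, so browsers show it as a tooltip on hover.
    """
    safe_name = escape(name)
    if not description:
        # No description: render the plain physical name only.
        return f'<span class="node">{safe_name}</span>'
    # escape() also escapes quotes by default, which keeps the
    # attribute value well-formed.
    safe_desc = escape(description, quote=True)
    return f'<span class="node" title="{safe_desc}">{safe_name}</span>'


tooltip_html = render_node_label("customers", "顧客マスタ")
plain_html = render_node_label("orders")
```

Because the description only appears on hover, users who do not need it would see the graph unchanged.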

I would greatly appreciate your consideration of this feature request.

zshandy commented 2 months ago

Hi,

That is a very interesting suggestion, and I think it could prove useful for other languages as well. I have a follow-up question: should the description come from the column comment stored in the database, or from the column description in dbt's schema.yml (https://docs.getdbt.com/reference/resource-properties/description)?

I ask because the graph is built solely by reading manifest.json; no other file is read. If the description is supposed to come from schema.yml, would it be present in manifest.json? If it comes from the database instead, it would be easier to implement, since the plugin could read it from the database and append it to the name.
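As a rough illustration of "append to the name", the label could be built as below, whichever source the description comes from. `label_with_description` and its `max_len` parameter are hypothetical names chosen for this sketch, not part of dbt-lineagex.

```python
def label_with_description(name: str, description: str = "", max_len: int = 30) -> str:
    """Append a short form of the description to the physical name.

    Only the first line of the description is used, and it is
    truncated to `max_len` characters so graph nodes stay compact.
    """
    stripped = description.strip()
    summary = stripped.splitlines()[0].strip() if stripped else ""
    if not summary:
        # Nothing to append: fall back to the physical name alone.
        return name
    if len(summary) > max_len:
        summary = summary[: max_len - 1] + "…"
    return f"{name} ({summary})"


label = label_with_description("customers", "顧客マスタ\n詳細な説明")
```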

Cheers,

KojiAndoJC commented 2 months ago

Thank you for your interest and for raising this important question.

For our purposes, either source—database comments or descriptions in dbt’s schema.yml—would be suitable. We typically synchronize these using the persist_docs configuration, as outlined in the dbt documentation (https://docs.getdbt.com/reference/resource-configs/persist_docs).

From what I've checked, it appears that manifest.json does include table and column descriptions, which would be useful for integrating these into the lineage graphs. Confirming this detail would still be beneficial.

zshandy commented 2 months ago

I see. Would it be possible to share a snippet or sample manifest.json and its corresponding dbt models, so that I can test against it during development? Also, I am currently in the middle of another project, but I expect to resume work on this in a month or so.

Cheers,

KojiAndoJC commented 2 months ago

Thank you for your response. I can't share the actual project code due to confidentiality constraints, but I'm working on putting together a sample project that includes "logical names" for your testing. It might take a little time to assemble, but I'll make sure it's ready for you to use when you return to this project. Thanks for your understanding.

zshandy commented 2 months ago

Yes, I completely understand the confidentiality constraints. A sample (extremely simple) project would be perfect; a few tables are good enough!