datnguye / dbterd

Generate the ERD as a code from dbt artifacts
https://dbterd.datnguyen.de/
MIT License

[FEAT] Select criteria using tags in models #57

Closed: philiphagglund closed this 7 months ago

philiphagglund commented 1 year ago

Is your feature request related to a problem? Please describe.
It does not seem possible to select models by one or more tags.

Describe the solution you'd like
Add the ability to run dbterd over the tags specified in the models' metadata.

models:
  - name: test
    config:
      materialized: incremental
      tags: [test, test2]  # list form; a bare `test, test2` is parsed as one string
    columns:
      - name: test_key
        tests:
          - relationships_test:
              to: ref('test2')
              field: test_key

dbterd run -ad target -s tag:test

or with a list of tags:

dbterd run -ad target -s tag:test,test2
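To make the request concrete, here is a minimal Python sketch of tag-based selection over a dbt manifest. It is only an illustration, not dbterd's implementation: `select_models_by_tags` is a hypothetical helper, and it relies on the `nodes`, `resource_type`, and `tags` keys found in `manifest.json`.

```python
from __future__ import annotations

import json
from pathlib import Path


def select_models_by_tags(manifest: dict, tags: set[str]) -> list[str]:
    """Return unique IDs of model nodes carrying at least one of the given tags.

    Hypothetical helper for illustration; not part of dbterd.
    """
    selected = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue
        # Each manifest node stores its tags as a list of strings.
        if tags & set(node.get("tags") or []):
            selected.append(unique_id)
    return sorted(selected)


if __name__ == "__main__":
    manifest_path = Path("target/manifest.json")
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        for unique_id in select_models_by_tags(manifest, {"test", "test2"}):
            print(unique_id)
```

A model is selected when it carries any of the requested tags, which mirrors the union behaviour hoped for with tag:test,test2.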

datnguye commented 1 year ago

Hey @philiphagglund Thanks for raising the request 👍

Could you check the --dbt option? It should let you use dbt's native selection syntax, including tags. Let me know if it doesn't work for you 😊

philiphagglund commented 1 year ago

It works fine using the --dbt option, thanks for that!

Another thing that occurred when I was running with the --dbt option was the following scenario:

Running dbt docs

$ dbt docs generate
05:16:05  Running with dbt=1.6.5
05:16:06  oracle adapter: Running in thick mode
05:16:06  Registered adapter: oracle=1.6.0
05:16:07  Found 1194 models, 40 seeds, 190 tests, 1 snapshot, 103 sources, 0 exposures, 0 metrics, 523 macros, 0 groups, 0 semantic models
05:16:07
05:16:15  Concurrency: 20 threads (target='dev')
05:16:15
05:16:41  Building catalog
05:17:06  Catalog written to C:\dev\kod\dbt.projects\dv\target\catalog.json

Running the dbterd run command works fine on these artifacts.

$ dbterd run
2023-10-26 07:17:19,623 - dbterd - INFO - Run with dbterd==1.7.0 (main.py:54)
2023-10-26 07:17:19,624 - dbterd - INFO - Using dbt project dir at: C:\dev\kod\dbt.projects\dv (base.py:45)        
2023-10-26 07:17:19,625 - dbterd - INFO - Using dbt artifact dir at: C:\dev\kod\dbt.projects\dv\target (base.py:69)
2023-10-26 07:17:22,352 - dbterd - INFO - Collected 1194 table(s) and 80 relationship(s) (test_relationship.py:59)
2023-10-26 07:17:22,412 - dbterd - INFO - C:\dev\kod\dbt.projects\dv\target/output.dbml (base.py:166)

But running dbterd run --dbt -s tag:link fails on the same artifacts as above, while it works when --dbt-auto-artifacts is added.

$ dbterd run --dbt -s tag:link
2023-10-26 07:20:57,227 - dbterd - INFO - Run with dbterd==1.7.0 (main.py:54)
2023-10-26 07:20:57,227 - dbterd - INFO - Using dbt project dir at: C:\dev\kod\dbt.projects\dv (base.py:45)
2023-10-26 07:20:57,227 - dbterd - DEBUG - Found dbt v1.6.5 installed at C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbt (dbt_invocation.py:78)
2023-10-26 07:20:57,243 - dbterd - DEBUG - Invoking: `dbt --quiet --log-level none ls --resource-type model --select tag:link --project-dir C:\dev\kod\dbt.projects\dv` at C:\dev\kod\dbt.projects\dv (dbt_invocation.py:44)
dv.dv.case_application_l
2023-10-26 07:20:59,898 - dbterd - INFO - Using dbt artifact dir at: C:\dev\kod\dbt.projects\dv\target (base.py:69)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Scripts\dbterd.exe\__main__.py", line 7, in <module>
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\main.py", line 6, in main
    cli.dbterd()
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\cli\params.py", line 117, in wrapper
    return func(*args, **kwargs)  # pragma: no cover
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\cli\main.py", line 75, in run
    Executor(ctx).run(**kwargs)
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\base.py", line 30, in run
    self.__run_by_strategy(**kwargs)
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\base.py", line 162, in __run_by_strategy
    result = operation(manifest=manifest, catalog=catalog, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\targets\dbml\dbml_test_relationship.py", line 16, in run
    return ("output.dbml", parse(manifest, catalog, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\targets\dbml\dbml_test_relationship.py", line 29, in parse
    tables, relationships = test_relationship.parse(
                            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\algos\test_relationship.py", line 23, in parse
    tables = base.get_tables(manifest=manifest, catalog=catalog, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\algos\base.py", line 41, in get_tables
    table = get_table(
            ^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\algos\base.py", line 100, in get_table
    database=manifest_node.database.lower(),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'
$ dbterd run --dbt -s tag:link --dbt-auto-artifacts
2023-10-26 07:21:40,288 - dbterd - INFO - Run with dbterd==1.7.0 (main.py:54)
2023-10-26 07:21:40,304 - dbterd - INFO - Using dbt project dir at: C:\dev\kod\dbt.projects\dv (base.py:45)
2023-10-26 07:21:40,304 - dbterd - DEBUG - Found dbt v1.6.5 installed at C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbt (dbt_invocation.py:78)
2023-10-26 07:21:40,304 - dbterd - DEBUG - Invoking: `dbt --quiet --log-level none ls --resource-type model --select tag:link --project-dir C:\dev\kod\dbt.projects\dv` at C:\dev\kod\dbt.projects\dv (dbt_invocation.py:44)
dv.dv.case_application_l
2023-10-26 07:21:42,905 - dbterd - DEBUG - Invoking: `dbt --quiet --log-level none docs generate --project-dir C:\dev\kod\dbt.projects\dv` at C:\dev\kod\dbt.projects\dv (dbt_invocation.py:44)
2023-10-26 07:22:27,739 - dbterd - INFO - Using dbt artifact dir at: C:\dev\kod\dbt.projects\dv/target (base.py:69)
2023-10-26 07:22:29,893 - dbterd - INFO - Collected 1 table(s) and 0 relationship(s) (test_relationship.py:59)
2023-10-26 07:22:29,894 - dbterd - INFO - C:\dev\kod\dbt.projects\dv\target/output.dbml (base.py:166)

I can't seem to understand why.

datnguye commented 1 year ago

Hi @philiphagglund This behavior looks really weird to me!

My initial guess is that the dbt ls command, triggered through dbt Programmatic Invocation (because of the --dbt option), has broken the manifest.json file that was previously generated by your dbt docs generate command.

Are you able to confirm if the below works fine?

dbt docs generate
dbterd run --dbt -s tag:link

I will try to look deeper once I get a chance. Thanks for flagging it! 👍
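One way to test the guess above, as a minimal sketch: list the model nodes in manifest.json whose database value is null, since that is the attribute the traceback fails on. `nodes_missing_database` is a hypothetical helper, and the target/manifest.json path is an assumption based on the artifact dir shown in the logs.

```python
import json
from pathlib import Path


def nodes_missing_database(manifest: dict) -> list:
    """Return unique IDs of model nodes whose `database` is null or absent.

    Hypothetical diagnostic helper; not part of dbterd or dbt.
    """
    return sorted(
        unique_id
        for unique_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model" and node.get("database") is None
    )


if __name__ == "__main__":
    # Assumed location: the default artifact dir used in the logs above.
    path = Path("target/manifest.json")
    if path.exists():
        manifest = json.loads(path.read_text(encoding="utf-8"))
        missing = nodes_missing_database(manifest)
        print(f"{len(missing)} model node(s) without a database value")
        for unique_id in missing[:20]:
            print(unique_id)
```

Running it once right after dbt docs generate and again after dbterd run --dbt would show whether the manifest changed in between. If nodes already lack a database value right after dbt docs generate, the file was not overwritten and the cause lies elsewhere.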

philiphagglund commented 1 year ago

Hi!

It doesn't work using:

dbt docs generate
dbterd run --dbt -s tag:hub

Result

$ dbterd run --dbt -s tag:hub
2023-10-26 16:58:59,044 - dbterd - INFO - Run with dbterd==1.7.0 (main.py:54)
2023-10-26 16:58:59,046 - dbterd - INFO - Using dbt project dir at: C:\dev\kod\dbt.projects\dv (base.py:45)
2023-10-26 16:58:59,049 - dbterd - DEBUG - Found dbt v1.6.5 installed at C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbt (dbt_invocation.py:78)
2023-10-26 16:58:59,049 - dbterd - DEBUG - Invoking: `dbt --quiet --log-level none ls --resource-type model --select tag:hub --project-dir C:\dev\kod\dbt.projects\dv` at C:\dev\kod\dbt.projects\dv (dbt_invocation.py:44)
dv.dv.application_h
dv.dv.case_h
2023-10-26 16:59:22,139 - dbterd - INFO - Using dbt artifact dir at: C:\dev\kod\dbt.projects\dv\target (base.py:69)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Scripts\dbterd.exe\__main__.py", line 7, in <module>
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\main.py", line 6, in main
    cli.dbterd()
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\cli\params.py", line 117, in wrapper
    return func(*args, **kwargs)  # pragma: no cover
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\cli\main.py", line 75, in run
    Executor(ctx).run(**kwargs)
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\base.py", line 30, in run
    self.__run_by_strategy(**kwargs)
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\base.py", line 162, in __run_by_strategy
    result = operation(manifest=manifest, catalog=catalog, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\targets\dbml\dbml_test_relationship.py", line 16, in run
    return ("output.dbml", parse(manifest, catalog, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\targets\dbml\dbml_test_relationship.py", line 29, in parse
    tables, relationships = test_relationship.parse(
                            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\algos\test_relationship.py", line 23, in parse
    tables = base.get_tables(manifest=manifest, catalog=catalog, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\algos\base.py", line 41, in get_tables
    table = get_table(
            ^^^^^^^^^^
  File "C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbterd\adapters\algos\base.py", line 100, in get_table
    database=manifest_node.database.lower(),
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'

But running dbterd run --dbt -s tag:hub --dbt-auto-artifacts works right away.

$ dbterd run --dbt -s tag:hub -s tag:link --dbt-auto-artifacts
2023-10-26 17:01:42,193 - dbterd - INFO - Run with dbterd==1.7.0 (main.py:54)
2023-10-26 17:01:42,195 - dbterd - INFO - Using dbt project dir at: C:\dev\kod\dbt.projects\dv (base.py:45)
2023-10-26 17:01:42,197 - dbterd - DEBUG - Found dbt v1.6.5 installed at C:\Users\b605xd\AppData\Local\Programs\Python\Python311\Lib\site-packages\dbt (dbt_invocation.py:78)
2023-10-26 17:01:42,197 - dbterd - DEBUG - Invoking: `dbt --quiet --log-level none ls --resource-type model --select tag:hub tag:link --project-dir C:\dev\kod\dbt.projects\dv` at C:\dev\kod\dbt.projects\dv (dbt_invocation.py:44)
dv.dv.application_h
dv.dv.case_application_l
dv.dv.case_h
2023-10-26 17:01:45,360 - dbterd - DEBUG - Invoking: `dbt --quiet --log-level none docs generate --project-dir C:\dev\kod\dbt.projects\dv` at C:\dev\kod\dbt.projects\dv (dbt_invocation.py:44)
2023-10-26 17:02:44,513 - dbterd - INFO - Using dbt artifact dir at: C:\dev\kod\dbt.projects\dv/target (base.py:69)
2023-10-26 17:02:48,620 - dbterd - INFO - Collected 3 table(s) and 2 relationship(s) (test_relationship.py:59)
2023-10-26 17:02:48,622 - dbterd - INFO - C:\dev\kod\dbt.projects\dv\target/output.dbml (base.py:166)

I agree with your initial guess regarding the dbt ls command and the manifest/catalog files. That seems to be the only difference compared to using --dbt-auto-artifacts.

datnguye commented 1 year ago

Thanks for the very detailed info!

Unfortunately, I couldn't reproduce the issue using dbt-postgres 1.6 and dbterd 1.7, as I don't have an Oracle adapter environment ready right now.

My conclusion for now is that we should save the artifacts to a separate directory and run dbterd against that copy:

mkdir ./target/save
cp ./target/manifest.json ./target/save/manifest.json
cp ./target/catalog.json ./target/save/catalog.json

dbterd run --dbt -ad ./target/save -s tag:hub -s tag:link
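For those on Windows, where mkdir and cp may not be available in the shell, the same copy step can be sketched in Python. This is only an illustrative equivalent of the commands above; `save_artifacts` is a hypothetical helper name, and the default paths simply mirror the ones used in this thread.

```python
import shutil
from pathlib import Path


def save_artifacts(target_dir: str = "target", save_dir: str = "target/save") -> list:
    """Copy manifest.json and catalog.json from target_dir into save_dir.

    Hypothetical helper mirroring the mkdir/cp commands; not part of dbterd.
    """
    src, dst = Path(target_dir), Path(save_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in ("manifest.json", "catalog.json"):
        # copy2 also preserves file timestamps where the platform allows it.
        copied.append(Path(shutil.copy2(src / name, dst / name)))
    return copied


if __name__ == "__main__":
    if Path("target/manifest.json").exists() and Path("target/catalog.json").exists():
        for path in save_artifacts():
            print(f"saved {path}")
```

After running it from the dbt project directory, the dbterd run --dbt -ad ./target/save command above reads the saved copies, so whatever dbt ls writes into ./target no longer affects them.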