Closed lawrenceadams closed 1 month ago
I do agree with adding this! Thanks for the initiative! I think I enabled Pages:
If this looks right, go ahead and merge to test it out!
It works!!! https://ohdsi.github.io/dbt-synthea
For some reason, the data types are not showing up in the GH Pages site, though they do show up when I build it locally. Quick Google says this might be a fix? https://github.com/dbt-labs/dbt-spark/issues/988
Excellent!! 🥳
For some reason, the data types are not showing up in the GH Pages site, though they do show up when I build it locally. Quick Google says this might be a fix? dbt-labs/dbt-spark#988
Interesting... I have can reproduce the same thing. The documentation on the docs is ironically thin on how you should do this.
I can confirm that if you use type
then the docs works. Buttt - the reason we don't have this issue locally is that if you have dbt run
/ dbt build
your project and have your warehouse available then dbt will grab the data type metadata from the database. I.e. if we update the workflow to do dbt build
then it will work fine.
Going down the rabbit hole, I am torn as to what to do:
data_type
is correct (type
is not valid)data_type
exists so that if you enable/enforce contracts then dbt will check the we have matched our expected schema on project build.
care_site
and failed as the ID uses bigint:
Interestingly, this should be handled by the adapter (unless this is for something else...):
But does not seem to work!
What do you think @katy-sadowski ? I think doing the dbt run
step before building the docs would be a way around this, and allow us to enforce schema checking in the future - but just using type
is another valid solution
I've gone too deep into the weeds, but essentially if we want contract enforcement to work then the dbt-duckdb
adapter needs to be updated to overload the DuckDBColumn
class here with an updated TYPE_LABELS
dict which would include mappings for 'BIGINT': 'INTEGER'
.
I have tested it as the below and confirm this builds and the contract enforcer code is happy when this is present:
I do think it would be cool to have contractually bound type checking using the CDM docs as the source but this would require upstream dbt-duckdb
changes, and I think BIGINT
is thought to be a distinctly different type to INT
(although I am not sure why, the contract feature allows type aliasing to be enabled or disabled at will.)
@katy-sadowski I will pull this out into a new issue and we can discuss there! 🧠
This PR adds automatic dbt docs generation for the
dbt-synthea
project.Why bother?
The docs are a nice place to start for those not used to either dbt / OMOP to explore the models/DAG/etc (IMHO - feel free to disagree!); is easier for us to point people to our project!
This approach requires little/no effort of maintainers once setup.
How does it work
Next steps