OHDSI / dbt-synthea

[Under development] A dbt ETL project to convert a Synthea synthetic data set into the OMOP CDM
https://ohdsi.github.io/dbt-synthea/
Apache License 2.0
16 stars 6 forks source link

feat: add documentation and tests for `person` table #52

Closed lawrenceadams closed 1 month ago

lawrenceadams commented 2 months ago

Closes #53


Thoughts

This PR may break the DBT Power User extension - using dbt_utils.relationships_where seems to upset the documentation editor when trying to save the yml file.

Given that if we apply documentation and tests to all tables/columns in the CDM the associated documentation file will explode in size anyway, maybe it is worth breaking out each table's documentation/testing into their own files? Happy to hear other's thoughts!

katy-sadowski commented 2 months ago

Thank you, @lawrenceadams ! This is great.

Question about breaking dbt Power User - do you mean that using dbt_utils tests breaks all of its functionality? Or just that it'll complain about yml files where these are used? If the latter, I think I'm ok with it because these look super useful.

I like the idea of breaking out each model config into its own file. Maybe as we add full tests & docs for each model, we can migrate the model to its own file. If you'd like to do that for person as part of this PR, that'd be great! Thank you!

lawrenceadams commented 2 months ago

Ahh yes - I should have expanded sorry, it doesn't break all the functionality - but upsets the ability to edit the yml file with the docs editor, it seems to get confused by the field that are used by the test dbt_utils.relationships_where when you save:

image

I had a quick look through the dbt poweruser code and couldn't make sense why this error is thrown - but when I remove the relationships_where tests it seems to go away...

I hope that makes sense!

I think splitting them out helps keep it manageable and its easy enough to edit the files by hand

katy-sadowski commented 2 months ago

Thanks for clarifying! I think it's fine to keep the dbt_utils tests in this case. Personally I don't find the Power User docs editor that useful and agree editing by hand in the yml is fine.

lawrenceadams commented 1 month ago

@katy-sadowski I think I might as well try and do all of them right now! Let me see how I get on... I don't see the point of merging this in until then!

katy-sadowski commented 1 month ago

Thank you @lawrenceadams ! I learned from @ap37348 (a new friend 😃 ) that he might have a script to auto-generate these yamls based on the OMOP CDM configs in https://github.com/OHDSI/commondatamodel. So if this is proving an onerous task, maybe let's see if he's able to share that out and save you some time!

lawrenceadams commented 1 month ago

That would be great! I realised last night that some stuff in the CDM docs is wrong and this is a silly approach given the html is actually quite neat and parsable in this case!

On 25 Sep 2024, at 01:18, Katy Sadowski @.***> wrote: Thank you @lawrenceadams ! I learned from @ap37348 (a new friend 😃 ) that he might have a script to auto-generate these yamls based on the OMOP CDM configs in https://github.com/OHDSI/commondatamodel. So if this is proving an onerous task, maybe let's see if he's able to share that out and save you some time!

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

lawrenceadams commented 1 month ago

I am questioning my previous self, as looking at the HTML in detail it is very automatable... I think I only started by hand to explore the process / what tests were needed and did not think of just automating the whole thing.

Has been a useful exercise finding broken links etc in the OMOP cdm docs however!

I will close this branch as its getting chaotic and will start again - will be much better!

ap37348 commented 1 month ago

Hey Katy,

I’ve been out on paternity leave. I will be back online a little next week and will push some changes to the git repo.

Best, Andrew

From: Katy Sadowski @.> Date: Tuesday, September 24, 2024 at 7:18 PM To: OHDSI/dbt-synthea @.> Cc: Payne, Andrew @.>, Mention @.> Subject: Re: [OHDSI/dbt-synthea] feat: add documentation and tests for person table (PR #52)

Thank you @lawrenceadamshttps://github.com/lawrenceadams ! I learned from @ap37348https://github.com/ap37348 (a new friend 😃 ) that he might have a script to auto-generate these yamls based on the OMOP CDM configs in https://github.com/OHDSI/commondatamodel. So if this is proving an onerous task, maybe let's see if he's able to share that out and save you some time!

— Reply to this email directly, view it on GitHubhttps://github.com/OHDSI/dbt-synthea/pull/52#issuecomment-2372615230, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AP3HS55FDXUUPQXPTCDOVXLZYH6NVAVCNFSM6AAAAABOG5ERG2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNZSGYYTKMRTGA. You are receiving this because you were mentioned.Message ID: @.***>

katy-sadowski commented 1 month ago

Hey Katy, I’ve been out on paternity leave. I will be back online a little next week and will push some changes to the git repo. Best, Andrew

Oh, congrats! No rush at all. @lawrenceadams already whipped up a script here: https://github.com/OHDSI/dbt-synthea/pull/60. I'm sure it'd be interesting for you to check out and compare the approach to yours. Feel free to comment on the PR or open another with additions/changes :)