Open jmbanda opened 8 years ago
Yes, this is an important requirement I know a lot of people have missed. Of course, it doesn't just apply to this study, but to all OHDSI tools and analyses.
@cgreich, any thoughts on how make sure people use NULL values for invalid_reason
when loading the vocab?
Yeah, that's tricky. In Oracle, '' and NULL are identical. Almost.
As a solution for this and the general confusion around deprecated relationships I was thinking to just taking out all those that are NOT NULL from the normal distribution. It would contain only valid records in СONCEPT_RELATIONSHIP. It would also shrink those zip files substantially. There is a valid need to have deprecated Concepts around, but I haven't found the need for the deprecated relationships. Could you think of any?
I can't think of any, but that doesn't completely remove the problem (we usually also check the invalid_reason
for concepts).
Probably a stupid idea, but why not have a specific value for valid records? (e.g. invalid_reason = 'V'
. Maybe change the name of the variable?), Assigning a specific meaning to NULL (as opposed to '') is asking for trouble.
I would propose to attach a loading script with the vocabulary instead of making any changes to the code for applications/studies. This way it will be loaded the way it is expected (or fixed) with a simple check like: update concept set invalid_reason = NULL where invalid_reason=''.
Juan M. Banda, Ph.D. Research Scientist - Center for Biomedical Informatics Research Stanford University School of Medicine
Google Scholar: http://goo.gl/RiO2jY | ResearchGate: http://goo.gl/7C31uc DBLP: http://goo.gl/BBvmJj | LinkedIn: http://goo.gl/h95tnq
On Mon, Jun 20, 2016 at 5:29 AM, Martijn Schuemie notifications@github.com wrote:
I can't think of any, but that doesn't completely remove the problem (we usually also check the invalid_reason for concepts).
Probably a stupid idea, but why not have a specific value for valid records? (e.g. invalid_reason = 'V'. Maybe change the name of the variable?), Assigning a specific meaning to NULL (as opposed to '') is asking for trouble.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/OHDSI/StudyProtocols/issues/7#issuecomment-227128461, or mute the thread https://github.com/notifications/unsubscribe/AJ8OiHZ7QdFjJPv9X5gBwKHFgd9_-1D5ks5qNoeVgaJpZM4I0-LW .
Looks like the SQL cohort code has the restriction of invalid_reason = NULL. Depending on the RDMBS flavor and how the vocabulary files are loaded this will be an issue. Loading the Athena files as-is on PostgreSQL will make invalid reason ='' rather than null.
While this is an easy fix, it might be worth making a note in the vocabulary load instructions that this should be the case. Otherwise, the study runs in a few seconds and gets no patients.