Closed. PRijnbeek closed this issue 3 years ago.
CPUs, memory, R version, list of installed packages: done with the benchmarkme package.
Check if all HADES packages are installed: added, including a message if packages are missing.
Return all results in a list object.
Check WebAPI is running: added.
List the topX unmapped source values per domain. This is a check for some 'low hanging fruit' to improve the mapping.
Count of concepts per vocabulary by standard, classification and non-standard.
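To make the "top X unmapped source values" item concrete, here is a minimal sketch for a single domain. It assumes that unmapped records carry concept id 0 (the usual CDM convention) and uses an in-memory SQLite table as a toy stand-in for `condition_occurrence`; the helper name `top_unmapped` and the sample rows are made up for illustration and are not part of the package.

```python
import sqlite3

# Toy stand-in for a CDM event table; the real table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_occurrence ("
    "person_id INTEGER, condition_concept_id INTEGER, condition_source_value TEXT)"
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
    [
        (1, 0, "XYZ"),       # unmapped
        (2, 0, "XYZ"),       # unmapped
        (3, 0, "ABC"),       # unmapped
        (4, 201826, "E11"),  # mapped record, must not be counted
    ],
)


def top_unmapped(conn, top_x):
    """Return the top_x most frequent source values whose concept id is 0."""
    return conn.execute(
        "SELECT condition_source_value, COUNT(*) AS n_records "
        "FROM condition_occurrence "
        "WHERE condition_concept_id = 0 "
        "GROUP BY condition_source_value "
        "ORDER BY n_records DESC, condition_source_value "
        "LIMIT ?",
        (top_x,),
    ).fetchall()


unmapped = top_unmapped(conn, 10)
print(unmapped)
```

The same query shape would be repeated per domain table, swapping in that table's concept id and source value columns.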
> List the topX unmapped source values per domain. This is a check for some 'low hanging fruit' to improve the mapping.
Yes, but if the source values are codes it may be less informative? Maybe we can join to source_to_concept_map and add a description? Is that used?
> Count of concepts per vocabulary by standard, classification and non-standard.
Not sure I follow. Do you mean a count of all vocabularies per domain:
Domain, vocabulary, nr patients, nr codes, standard
Standard should always be true.
> Yes, but if the source values are codes it may be less informative? Maybe we can join to source_to_concept_map and add a description? Is that used?
Well, we need to know the source_vocabulary_id for that, and that information is not present in the event tables. The same issue, by the way, applies to getting the frequency of the codes in the source_to_concept_map table: e.g. if source_code 1 is in multiple source vocabularies, then we cannot count them separately.
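A small sketch of why the counts cannot be separated, using an in-memory SQLite stand-in. The vocabulary ids `LOCAL_A` and `LOCAL_B`, the target concept ids and the sample rows are hypothetical; the point is only that the event table carries the source value but not the source vocabulary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source_to_concept_map (
        source_code TEXT, source_vocabulary_id TEXT, target_concept_id INTEGER);
    CREATE TABLE condition_occurrence (
        person_id INTEGER, condition_source_value TEXT);

    -- The same source_code '1' exists in two (hypothetical) source vocabularies.
    INSERT INTO source_to_concept_map VALUES ('1', 'LOCAL_A', 1001);
    INSERT INTO source_to_concept_map VALUES ('1', 'LOCAL_B', 2002);

    -- Three events with source value '1'; nothing says which vocabulary they came from.
    INSERT INTO condition_occurrence VALUES (1, '1');
    INSERT INTO condition_occurrence VALUES (2, '1');
    INSERT INTO condition_occurrence VALUES (3, '1');
    """
)

# Joining on the code alone attributes every event to both vocabularies.
rows = conn.execute(
    "SELECT stcm.source_vocabulary_id, COUNT(*) AS n_records "
    "FROM condition_occurrence co "
    "JOIN source_to_concept_map stcm "
    "  ON stcm.source_code = co.condition_source_value "
    "GROUP BY stcm.source_vocabulary_id "
    "ORDER BY stcm.source_vocabulary_id"
).fetchall()

n_events = conn.execute("SELECT COUNT(*) FROM condition_occurrence").fetchone()[0]
print(rows, n_events)
```

Each vocabulary is credited with all three events, so the per-vocabulary counts sum to six although only three records exist.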
> Not sure I follow. Do you mean a count of all vocabularies per domain:
> Domain, vocabulary, nr patients, nr codes, standard
> Standard should always be true.
That might actually also be interesting, what target vocabularies were used per domain.
However, my suggestion was simpler; just counting in the concept table (`select count(*) from concept group by vocabulary_id, standard_concept`). For instance, this would show whether the CPT4 vocabulary was actually loaded (this is an additional step in the vocab loading process). But could also show other anomalies with the vocabulary loading.
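A slightly fuller version of that count might look like the sketch below. It assumes the usual `standard_concept` coding ('S' for standard, 'C' for classification, NULL for non-standard) and runs against an in-memory SQLite toy `concept` table with invented rows, so the numbers are illustrative only. Unlike the one-liner above, it also selects the grouping columns so each count can be read off per vocabulary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    "concept_id INTEGER, vocabulary_id TEXT, standard_concept TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?)",
    [
        (1, "SNOMED", "S"),
        (2, "SNOMED", "S"),
        (3, "SNOMED", None),   # non-standard
        (4, "ATC", "C"),       # classification
        (5, "ICD10CM", None),  # non-standard
    ],
)

counts = conn.execute(
    "SELECT vocabulary_id, "
    "       CASE standard_concept "
    "            WHEN 'S' THEN 'standard' "
    "            WHEN 'C' THEN 'classification' "
    "            ELSE 'non-standard' END AS concept_type, "
    "       COUNT(*) AS n_concepts "
    "FROM concept "
    "GROUP BY vocabulary_id, concept_type "
    "ORDER BY vocabulary_id, concept_type"
).fetchall()

# A vocabulary that was never loaded simply has no rows in the result.
loaded_vocabularies = {vocabulary_id for vocabulary_id, _, _ in counts}
cpt4_loaded = "CPT4" in loaded_vocabularies
print(counts, cpt4_loaded)
```

In this toy data CPT4 is absent from the result, which is the kind of anomaly the check is meant to surface.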
> Well, we need to know the source_vocabulary_id for that, and that information is not present in the event tables. Same issue btw to get the frequency for the codes in the source_to_concept_map table. e.g. if the source_code 1 is in multiple source vocabularies, then we cannot count them separately.
Can you give me the query you would like to see?
> However, my suggestion was simpler; just counting in the concept table (`select count(*) from concept group by vocabulary_id, standard_concept`). For instance, this would show whether the CPT4 vocabulary was actually loaded (this is an additional step in the vocab loading process). But could also show other anomalies with the vocabulary loading.
Yes, agreed, I will add this. I currently dump the vocabulary table, but this is better; I can join that anyway.
> Count of concepts per vocabulary by standard, classification and non-standard.
Done.