shabiel closed this issue 4 years ago.
Thanks @shabiel. I added a note to the wiki to reinforce this practice: https://github.com/synthetichealth/synthea/wiki/Generic-Module-Framework:-Basics#rxnorm-codes
Thank you.
Here's another one: 316049. That's Hydrochlorothiazide 25mg as a component. It can be part of 20-30 prescribable drugs.
I am not sure what I want to do about it yet. I can fix them again, but I want a long-term solution: something like rejecting a pull request if an RxNorm code is not valid for a prescribable drug. Let me think about it.
GitHub does support pull request templates, so it's possible to add a checklist of requirements for the contributor and reviewer to verify on each pull request. As the Synthea community continues to grow, I think it would be a good idea to define some expectations that new contributions should adhere to. It helps potential contributors to know the objective review criteria in advance, and it helps reviewers remember to check everything important.
I am thinking of an automated way: a web service call on each pull request to check if the RxNorm has NDCs. I haven't validated these yet, but something like this: https://rxnav.nlm.nih.gov/RxNormAPIs.html#uLink=RxNorm_REST_getAllHistoricalNDCs and https://rxnav.nlm.nih.gov/RxNormAPIs.html#uLink=RxNorm_REST_getAllNDCs.
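A minimal sketch of what such a pull request check might look like, assuming the codes touched by the pull request have already been extracted to a list. The function names are hypothetical and not part of Synthea. The lookup uses the `ndcs.json` endpoint that the script in this thread calls. The historical NDC endpoint linked above may fit the "now or in the past" rule better, but its response shape is not assumed here.

```shell
#!/bin/bash
# Hypothetical helper names; nothing here exists in Synthea.

# Print the number of NDCs RxNav reports for one RxNorm code.
# Requires curl and jq, plus network access to rxnav.nlm.nih.gov.
ndc_count() {
  curl -s "https://rxnav.nlm.nih.gov/REST/rxcui/$1/ndcs.json" |
    jq -c '.ndcGroup.ndcList.ndc | length'
}

# Read "CODE COUNT" lines on stdin and print every code whose count is
# zero (a missing or non-numeric count is treated as zero).
# Returns 1 if any code was printed, so a CI step can fail the pull request.
reject_codes_without_ndcs() {
  awk '$2 + 0 == 0 { print $1; bad = 1 } END { exit bad }'
}

# Possible wiring in a CI job (codes.txt is an assumed input file):
#   while read -r code; do
#     echo "$code $(ndc_count "$code")"
#   done < codes.txt | reject_codes_without_ndcs
```

Keeping the network lookup separate from the pass/fail decision means the decision logic can be exercised offline with canned counts.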
316052: HCTZ 6.25mg does not exist on the market (and as far as I know, it never did, at least for humans. Maybe cats or dogs get to have it!).
I will work over the next three days to get these fixed, so we can close the issue.
I started on this. I wrote a script to extract all RxNorm codes from Synthea and analyze them. Here is the script and the preliminary results:
#!/bin/bash
file1=/tmp/syn.rxnorm.all
file2=/tmp/syn.rxnorm.sorted.uniq
rm -f "$file1" "$file2"

# Grab all the RxNorm codes in all the modules
echo "Extracting all RxNorm codes..."
find src/main/resources/modules -type f -print0 |
  while IFS= read -r -d '' model; do
    jq -rc '.. | .codes?[]? | select(.system == "RxNorm").code' < "$model" >> "$file1"
  done

# Deduplicate the codes, so we only have unique codes
echo "Deduplicating..."
sort -n "$file1" | uniq > "$file2"
#cat $file2
#exit 0

# For each unique code, ask RxNav for its NDC count and its term type (TTY)
echo "Calling RxNorm API to check NDCs"
echo "CODE TYPE #NDCS" | column -tx
while read -r code; do
  ndcsLength=$(curl -s "https://rxnav.nlm.nih.gov/REST/rxcui/$code/ndcs.json" | jq -c '.ndcGroup.ndcList.ndc | length')
  tty=$(curl -s "https://rxnav.nlm.nih.gov/REST/rxcui/$code/property.json?propName=tty" | jq -r '.propConceptGroup.propConcept[0].propValue')
  echo "$code $tty $ndcsLength" | column -tx
done < "$file2"
Result:
CODE TYPE #NDCS
480 IN 0
4337 IN 0
10324 IN 0
38409 IN 0
56795 IN 0
72965 IN 0
73032 IN 0
84857 IN 0
105078 SCD 0
105586 SCD 0
106258 SCD 372
106892 SBD 5
141918 SCD 0
197319 SCD 163
197378 SCD 0
198014 SCD 525
198031 SCD 47
198240 SCD 40
198405 SCD 1
199224 SCD 68
200064 SCD 39
200243 SCD 16
200252 null 0
205532 SBD 2
205923 SBD 2
210856 SBD 0
235389 MIN 0
238100 SCD 16
243670 SCD 2
258494 IN 0
282464 SCD 0
308182 SCD 164
308192 SCD 21
308971 SBD 1
309043 SCD 3
309045 SCD 30
309097 SCD 70
309845 SCD 6
310261 SCD 12
310325 SCD 106
310436 SCD 33
310965 SCD 1468
311372 SCD 640
311700 SCD 60
311989 SCD 8
311995 SCD 68
312617 SCD 298
313002 SCD 46
313185 SCD 0
313782 SCD 346
314659 PIN 0
315971 SCDC 0
316049 SCDC 0
316672 SCDC 0
328670 SCDC 0
389221 SCD 0
406022 SCDC 0
429503 SCD 38
477045 SCD 1
483438 SCD 57
542347 SCD 20
562251 SCD 15
563026 SBDC 0
567645 SBDC 0
583214 SCD 0
596926 SCD 147
597195 SCD 75
608139 SCD 30
665078 SCD 21
672149 SCD 0
727762 SCD 9
745679 SCD 0
746030 SBD 1
748856 BPCK 3
748879 BPCK 6
748962 BPCK 4
749762 BPCK 2
749785 BPCK 8
749882 null 0
751905 BPCK 5
752899 SCD 0
757594 BPCK 1
789980 SCD 18
807283 SBD 3
831533 BPCK 4
833137 SBDC 0
834061 SCD 124
834102 SCD 223
835900 SBDC 0
849574 SCD 668
856980 SCD 36
857005 SCD 339
858069 SBDC 0
860975 SCD 276
861467 SCD 17
864718 SCD 47
865098 SBD 6
866414 SBD 18
895994 SCD 0
896209 SCD 3
897122 SCD 0
904419 SCD 25
966222 SCD 86
978950 BPCK 6
993452 SCD 0
996740 SCD 7
997223 SCD 212
997488 SCD 34
997501 SCD 121
998582 SBDC 0
998755 SCDC 0
999969 SBDC 0
1000126 SCD 24
1000156 SCD 0
1014676 SCD 61
1014678 SCD 618
1043400 SCD 302
1049221 SCD 308
1049630 SCD 484
1049636 SBDC 0
1049683 SCD 53
1091392 SCD 47
1094107 SCD 148
1114085 SCD 8
1153378 SCDG 0
1160499 SCDG 0
1234995 SCD 58
1310197 SBDC 0
1359133 BPCK 2
1363309 SCD 147
1366343 null 0
1367439 SBD 8
1373463 SCD 0
1534809 SCD 0
1599803 SCD 2
1601380 SCD 0
1605257 SBD 2
1650142 SCD 34
1652673 SCD 15
1655927 MIN 0
1656318 SCD 13
1658084 SCD 0
1659149 SCD 41
1719286 SCD 27
1732136 SCD 2
1732186 SCD 14
1734340 SCD 0
1734919 SCD 14
1736776 SCD 32
1736854 SCD 1
1737449 SCD 9
1740467 SCD 76
1790099 SCD 7
1791701 SCD 14
1803932 SCD 7
1808217 SCD 12
1856546 SBD 3
1860154 SCD 1
1860480 SCD 12
1870230 SCD 13
1873983 SCD 0
1940648 SCD 0
1946831 BN 0
2001499 SCD 1
2119714 SCD 0
I am very pleased with this. You can already tell where the problems are! I will refine the script as I learn more.
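One way to make the pattern in the results easier to see is to count the zero-NDC codes per term type. The sketch below assumes the table above has been saved to a file in the same three-column layout. The function name is hypothetical. Rows with types such as IN, SCDC, SBDC and SCDG are not prescribable drugs, so a zero there points at a wrongly chosen code, while a zero on an SCD or SBD row probably needs a closer look at whether the product was ever marketed.

```shell
#!/bin/bash
# Hypothetical helper: reads the "CODE TYPE #NDCS" table on stdin and
# prints, for each term type, how many codes have zero NDCs.
summarize_zero_ndc_by_type() {
  # Only rows whose first column is a numeric code are counted,
  # which also skips the header line.
  awk '$1 ~ /^[0-9]+$/ && $3 == 0 { n[$2]++ }
       END { for (t in n) print t, n[t] }' | LC_ALL=C sort
}

# Example (results.txt is an assumed file holding the table):
#   summarize_zero_ndc_by_type < results.txt
```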
Please consider submitting a pull request to add your refined script to a synthea/scripts/ folder.
I raised this issue before and made a pull request to clean up the existing meds, but I see it has crept back in. I will record it here on GitHub: when picking an RxNorm code, you must use something that has an NDC on the market, now or in the past.
E.g.: 1153378 is Clonazepam Oral. That is not something a patient can take. It has to be a specific product, such as Clonazepam 0.5mg oral tablet (197527).