Status: Open. Opened by jvendetti 2 years ago.
The only obvious difference I see is that terms in the 2021AB TTL file do not have any "SIB" relationships, unlike in previous versions.
MEDDRA]$ wc -l 16/MEDDRA.ttl
2594640 16/MEDDRA.ttl
MEDDRA]$ wc -l 19/MEDDRA.ttl
978517 19/MEDDRA.ttl
MEDDRA]$ grep SIB 16/MEDDRA.ttl | wc -l
1659383
MEDDRA]$ grep SIB 19/MEDDRA.ttl | wc -l
0
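Remove the SIB lines from the 16 version and you're left with 935,257 lines (2,594,640 minus 1,659,383), which is within 4.4% (43K lines) of the 19 version. That is pretty suspicious: allowing for natural growth, the missing SIB lines account for nearly all of the drop.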
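To make that arithmetic easy to re-check, here is a minimal sketch that recomputes the comparison from the `wc` and `grep` counts shown above. The function name `sib_adjusted_gap` is made up for illustration, and the percentage is taken relative to the 19 version, which is the reading that reproduces the 4.4% figure.

```python
def sib_adjusted_gap(old_total, old_sib, new_total):
    """Compare the old file, with its SIB lines removed, against the new file.

    Returns (old line count without SIB lines,
             line gap to the new file,
             gap as a percentage of the new file).
    """
    old_without_sib = old_total - old_sib
    gap = new_total - old_without_sib
    gap_pct = 100.0 * gap / new_total
    return old_without_sib, gap, gap_pct


# Line counts copied from the wc/grep output above.
old_without_sib, gap, gap_pct = sib_adjusted_gap(
    old_total=2594640,  # wc -l 16/MEDDRA.ttl
    old_sib=1659383,    # grep SIB 16/MEDDRA.ttl | wc -l
    new_total=978517,   # wc -l 19/MEDDRA.ttl
)
print(f"16 without SIB lines: {old_without_sib}")
print(f"gap to 19: {gap} lines ({gap_pct:.1f}% of 19)")
```

This relies on `grep SIB ... | wc -l` counting matching lines, so subtracting it from the total line count is an exact "lines without SIB" figure for the 16 version.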
I was contacted by an end user who reported a significant decrease in the size of the MedDRA ontology source file between releases. I looked at the file size for all submissions and noticed a steady increase through the 2020AB release, after which there's a significant drop.
I looked at the UMLS MDR statistics page for the 2020AB release, which reports 71,603 lower level terms (which I assume are classes). The statistics page for the latest 2021AB release reports 73,991 lower level terms. There's nothing obvious in the UMLS documentation that would explain the file size decrease, considering that the number of terms went up.
The BioPortal REST API reports that we're serving 76,447 classes (https://data.bioontology.org/ontologies/MEDDRA/metrics).
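For anyone who wants to re-check the class count, the following is a rough sketch of querying that metrics endpoint from Python. It rests on assumptions not confirmed in this thread: that the JSON response carries a top-level `classes` field, and that the API key is passed in an `Authorization: apikey token=...` header. The function names are hypothetical, and you need to supply your own BioPortal API key.

```python
import json
import urllib.request

# URL taken from the comment above.
METRICS_URL = "https://data.bioontology.org/ontologies/MEDDRA/metrics"


def build_metrics_request(api_key):
    """Build the GET request for the MEDDRA metrics endpoint.

    Assumption: the key is accepted in an 'apikey token=...' Authorization header.
    """
    return urllib.request.Request(
        METRICS_URL,
        headers={
            "Authorization": f"apikey token={api_key}",
            "Accept": "application/json",
        },
    )


def extract_class_count(payload):
    """Pull the class count out of a decoded metrics response.

    Assumption: the count lives in a top-level 'classes' field.
    """
    return int(payload["classes"])


def fetch_class_count(api_key):
    """Call the endpoint and return the reported number of classes."""
    with urllib.request.urlopen(build_metrics_request(api_key), timeout=30) as resp:
        return extract_class_count(json.load(resp))
```

Calling `fetch_class_count("YOUR_API_KEY")` would then return the number the API currently reports, which can be compared against the UMLS term counts quoted above.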