gaurav closed this 1 month ago
I dunno about this one. Are the entries in DrugBank really Drugs from our POV? They look like active ingredients (small molecules, etc.) to me.
I think you're right. This PR uses the DrugBank Open Vocabulary file, and most of the names are generic names like ibuprofen, Ibuprofen piconol, captopril, VTP-194204, Etanercept, Erythropoietin, WRR-99, Zofin, Krill Oil, MK-886, BMS-833923 and others. I figured it made sense to categorize all of these as drugs, as a way of grouping everything from small molecules to protein hormones to organic substances that have some sort of medical benefit. But if our criterion for "Drug" is a specific formulation (e.g. "acetaminophen 5mg capsule"), then yeah, these would not qualify. I'm not sure we can uniformly say these are all small molecules, but I think most of them are, so I've reverted the type for DrugBank entries from biolink:Drug back to biolink:ChemicalEntity (d38ce21). I've also made a note for us to check for other small molecules/chemical entities that might have accidentally ended up in Drug.txt (#348).
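For reference, here is a minimal sketch of what the revert amounts to. It is not the code in this PR: the function name, the `DRUGBANK` prefix, and the column names `DrugBank ID` and `Common name` are assumptions about how the vocabulary file is laid out, and the IDs in the sample are placeholders.

```python
import csv
import io

# Type assigned to DrugBank entries after the revert (previously biolink:Drug).
DRUGBANK_TYPE = "biolink:ChemicalEntity"


def drugbank_labels(csv_text, prefix="DRUGBANK"):
    """Yield (curie, label, biolink_type) for each row of a vocabulary CSV.

    The column names 'DrugBank ID' and 'Common name' are assumptions;
    adjust them to match the actual download.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        drugbank_id = (row.get("DrugBank ID") or "").strip()
        label = (row.get("Common name") or "").strip()
        # Skip rows that lack either an identifier or a label.
        if drugbank_id and label:
            yield f"{prefix}:{drugbank_id}", label, DRUGBANK_TYPE


# Placeholder IDs, not real DrugBank accessions.
sample = "DrugBank ID,Common name\nDBXXXX1,ibuprofen\nDBXXXX2,captopril\n"
for curie, label, biolink_type in drugbank_labels(sample):
    print(curie, label, biolink_type)
```

Keeping the type in one constant means a later decision to split out true formulations as biolink:Drug would only need a per-row rule in one place.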
Incidentally, in addition to the DrugBank ID, many of the 16,581 chemicals in the DrugBank download have a UNII, CAS number or InChI Key. I don't think we can use those to categorize DrugBank entries any better, and I'm not sure whether we want to include them as concords, but I wanted to mention it in case it's useful.
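If we want to see how many entries actually carry each of those identifiers before deciding on concords, something like the following could work. This is only a sketch: the column names `UNII`, `CAS` and `Standard InChI Key` are assumptions about the download's header row, `xref_coverage` is a hypothetical helper, and the sample values are placeholders.

```python
import csv
import io
from collections import Counter

# Assumed column names for the cross-references in the vocabulary CSV.
XREF_COLUMNS = ("UNII", "CAS", "Standard InChI Key")


def xref_coverage(csv_text):
    """Count all rows, plus rows with a non-empty value in each xref column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts["total"] += 1
        for column in XREF_COLUMNS:
            # Missing or short rows yield None, so fall back to "".
            if (row.get(column) or "").strip():
                counts[column] += 1
    return counts


# Placeholder rows, not real DrugBank data.
sample_rows = (
    "DrugBank ID,Common name,CAS,UNII,Standard InChI Key\n"
    "DBXXXX1,ibuprofen,CAS-PLACEHOLDER,UNII-PLACEHOLDER,\n"
    "DBXXXX2,Etanercept,,UNII-PLACEHOLDER,\n"
)
coverage = xref_coverage(sample_rows)
for name in ("total", *XREF_COLUMNS):
    print(name, coverage[name])
```

Run against the real file, the per-column counts would show which identifier (if any) covers enough of the entries to be worth a concord.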
This PR adds DrugBank labels (from DrugBank v5.1.12). It somehow closes #332, but I'm not sure how (it might be an earlier change in PR #279 that actually closed it).
Should be merged after PR #279.