I've built out working classes for both disease and therapy normalization, but since disease is giving us trouble I wanted to put this PR up first. It makes some changes:
Add OncoTree, DO, and NCIt source classes
Add a "custom" data source class. A user provides version and download methods and gets the data storage, file selection, and related routines for free. I've used this to a) define some custom OMIM rules for how we handle it in the disease normalizer (we could also write an internal tool to grab data from IGM annotations, but I wasn't sure about writing that out in an open repo) and b) manage the extra RxNorm file that we generate in therapy.
`utils` module

WIP disease implementation is here: https://github.com/cancervariants/disease-normalization/pull/163
Here's an example of the `CustomData` class for handling RxNorm drug forms: https://github.com/cancervariants/therapy-normalization/blob/d5794d6e9c83d3a4ad1dc2f56be166d5c63dcaf5/therapy/etl/rxnorm.py#L71
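
For anyone skimming, here's a rough sketch of the idea behind the custom data source: the caller supplies a "what's the latest version" callback and a "download it" callback, and the class takes care of storage and file selection. This is only an illustration, not the code in this PR. `CustomDataSketch`, its parameter names, the file naming scheme, and the fake callbacks are all assumptions made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Callable, Tuple


class CustomDataSketch:
    """Hypothetical stand-in for a "custom" data source (not this PR's actual API)."""

    def __init__(
        self,
        src_name: str,
        filetype: str,
        latest_version_cb: Callable[[], str],
        download_cb: Callable[[str, Path], None],
        data_dir: Path,
    ) -> None:
        self.src_name = src_name
        self.filetype = filetype
        self._latest_version_cb = latest_version_cb
        self._download_cb = download_cb
        # Storage comes "for free": each source gets its own subdirectory.
        self.data_dir = data_dir / src_name
        self.data_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, version: str) -> Path:
        # Assumed naming scheme: <source>_<version>.<filetype>
        return self.data_dir / f"{self.src_name}_{version}.{self.filetype}"

    def get_latest(self, force_refresh: bool = False) -> Tuple[Path, str]:
        """Return (path, version), downloading only if the file is missing."""
        version = self._latest_version_cb()
        path = self._path_for(version)
        if force_refresh or not path.exists():
            self._download_cb(version, path)
        return path, version


# Fake callbacks standing in for real version lookup and download logic.
def fake_latest_version() -> str:
    return "20240101"


def fake_download(version: str, outfile: Path) -> None:
    outfile.write_text(f"drug forms for {version}\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = CustomDataSketch(
            "rxnorm_drug_forms", "yaml", fake_latest_version, fake_download, Path(tmp)
        )
        path, version = source.get_latest()
        print(path.name, version)
```

The point of the callback design is that source-specific logic (such as the OMIM rules or the generated RxNorm file) stays in two small functions, while caching and file lookup are shared.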