Closed jhourani closed 10 years ago
Synonyms can now be parsed from most datasets. It would be very easy to add them for the other datasets once we know where to pull them from. The synonyms version of the code can be found on the synonyms branch. Simply run gp_baseline.py and a synonyms.json file will be written to the specified directory.
synonyms.json output not written
synonym_dict only contains MeSH and ChEBI synonyms (plus empty dictionaries for entrez, mgi, hgnc, and swissprot). We need synonyms for each namespace where relevant.
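For reference, a minimal sketch of what a per-namespace synonym_dict and the synonyms.json write step could look like. This is not the code from the synonyms branch: the `write_synonyms` helper, the dictionary layout (namespace, then id, then a set of synonyms), and the sample ids and names are all assumptions made up for illustration.

```python
import json
import os
import tempfile


def write_synonyms(synonym_dict, output_dir):
    """Write a {namespace: {id: synonyms}} mapping to synonyms.json.

    Hypothetical helper; the real gp_baseline.py may structure this differently.
    """
    # Sets are not JSON-serialisable, so convert each one to a sorted list.
    serialisable = {
        namespace: {term_id: sorted(names) for term_id, names in terms.items()}
        for namespace, terms in synonym_dict.items()
    }
    path = os.path.join(output_dir, "synonyms.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(serialisable, handle, indent=2, sort_keys=True)
    return path


# Made-up placeholder data, not real MeSH/ChEBI/HGNC records.
synonym_dict = {
    "mesh": {"ID1": {"example name", "alt name"}},
    "chebi": {"ID2": {"example compound"}},
    "hgnc": {},  # an empty namespace is still written out
}

if __name__ == "__main__":
    out_path = write_synonyms(synonym_dict, tempfile.mkdtemp())
    print("wrote", out_path)
```

Keeping empty namespaces in the output makes it easy to see at a glance which datasets still lack synonyms.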
Synonyms are now parsed from all applicable data sets, but are not output as part of gp_baseline.
For each namespace-generating data set, get_synonym_names() and get_synonym_symbols() return a synonym dictionary (or None), as applicable.
A set of synonyms is now obtained for a given id with get_alt_names(id) and/or get_alt_symbols(id). This is used for the output of rdf.py, for all namespaces, as applicable.
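A rough sketch of how these accessors might fit together on a data set object. Only the four method names come from this thread; the `ExampleDataSet` class, its constructor, the record layout, and the sample ids are hypothetical and not taken from the actual code.

```python
class ExampleDataSet:
    """Hypothetical stand-in for a namespace-generating data set."""

    def __init__(self, records, has_symbols=True):
        # records: {term_id: {"alt_names": [...], "alt_symbols": [...]}}
        self._records = records
        self._has_symbols = has_symbols

    def get_alt_names(self, term_id):
        """Return the set of alternative names for one id (empty if unknown)."""
        return set(self._records.get(term_id, {}).get("alt_names", []))

    def get_alt_symbols(self, term_id):
        """Return the set of alternative symbols for one id (empty if unknown)."""
        if not self._has_symbols:
            return set()
        return set(self._records.get(term_id, {}).get("alt_symbols", []))

    def get_synonym_names(self):
        """Return {id: set of alternative names} for the whole data set."""
        return {tid: self.get_alt_names(tid) for tid in self._records}

    def get_synonym_symbols(self):
        """Return {id: set of alternative symbols}, or None if not applicable."""
        if not self._has_symbols:
            return None
        return {tid: self.get_alt_symbols(tid) for tid in self._records}


# Made-up placeholder records for illustration only.
genes = ExampleDataSet(
    {"ID1": {"alt_names": ["example gene"], "alt_symbols": ["EX1", "EXA"]}}
)
chemicals = ExampleDataSet(
    {"ID2": {"alt_names": ["example compound"]}}, has_symbols=False
)
```

Returning an empty set for an unknown id, rather than raising, keeps callers such as an RDF writer simple, though the real code may make a different choice.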
We would like to be able to get back a term and all of its synonyms for each dataset. This has a use in SDP-Web also.