Open petermr opened 5 years ago
<synonym> child elements to dictionary <entry> elements
Will start by creating a bag of unknown terms.
We need to sort compounds by WikidataID and PubchemCID to determine synonyms. Example:
para-cymen-7-ol 325 4-Isopropylbenzyl alcohol
p-cymen-7-ol p-cymen-7-ol 325 4-Isopropylbenzyl alcohol
These two entries relate to the same CID, so they should be grouped together. PMR will then decide which is the best to keep.
cuminaldehyde cuminaldehyde cuminaldehyde Q419952 326 4-Isopropylbenzaldehyde
cuminal cuminal cuminaldehyde Q419952 326 4-Isopropylbenzaldehyde
octanal
has both Wikidata and Pubchem
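The grouping described above can be sketched in Python. This is only an illustration, not the project's actual tooling: the three-column row layout (name, Wikidata ID, PubChem CID), the use of an empty string for a missing ID, and the function name `group_by_id` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical rows: (name, wikidata_id, pubchem_cid).
# An empty string marks an ID that was not found (an assumption for this sketch).
rows = [
    ("para-cymen-7-ol", "", "325"),
    ("p-cymen-7-ol", "", "325"),
    ("cuminaldehyde", "Q419952", "326"),
    ("cuminal", "Q419952", "326"),
]


def group_by_id(rows, column):
    """Group compound names by the ID held at the given column index.

    Rows whose ID is blank are skipped, since they cannot be matched.
    """
    groups = defaultdict(list)
    for row in rows:
        key = row[column]
        if key:
            groups[key].append(row[0])
    return dict(groups)


by_cid = group_by_id(rows, 2)       # candidate synonyms sharing a PubChem CID
by_wikidata = group_by_id(rows, 1)  # candidate synonyms sharing a Wikidata ID

for cid, names in by_cid.items():
    print(cid, names)
```

Each group holds the candidate synonyms for one compound; choosing the preferred name within a group remains a manual decision.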
@ambarishK will sort the table in a spreadsheet on the WikidataID column (notFoundWIKIDATASortedPubChem.tsv). PMR will then edit this manually.
@ambarishK will sort the table in a spreadsheet on the PubChemID column (notFoundWIKIDATAPubChemSorted.tsv). PMR will then edit this manually.
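If the sorting were scripted rather than done in a spreadsheet, a minimal sketch might look like the following. The header names (`name`, `WikidataID`, `PubChemCID`), the sample rows, and the helper `sort_tsv` are hypothetical; the real column layout of the TSV files is not shown here. Note that the sort compares IDs as strings, so CIDs of different lengths would need a numeric key instead.

```python
import csv
import io


def sort_tsv(text, column):
    """Return TSV text with its rows sorted on the named column.

    Rows with a blank value in that column are placed last, so that
    entries sharing an ID end up next to each other at the top.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = list(reader)
    # Key is (is_blank, value): blanks sort after all non-blank values.
    rows.sort(key=lambda r: (r[column] == "", r[column]))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample data, not taken from the real files.
sample = (
    "name\tWikidataID\tPubChemCID\n"
    "cuminal\tQ419952\t326\n"
    "unknown-term\t\t\n"
    "p-cymen-7-ol\t\t325\n"
    "cuminaldehyde\tQ419952\t326\n"
)

sorted_text = sort_tsv(sample, "PubChemCID")
sorted_names = [line.split("\t")[0] for line in sorted_text.splitlines()[1:]]
print(sorted_text)
```

Because Python's sort is stable, rows with the same ID keep their original relative order, which makes the manual review of each group easier to follow.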
The recommitted files will normalize to a single reference for Wikidata and for Pubchem. PMR will then merge possible conflicts and fuzziness.
The compound names in table columns are frequently ambiguous. The first table is https://github.com/petermr/CEVOpen/blob/master/articleAnalysis/oil186/raw/thyme.tsv