Open PhillipsOwen opened 3 years ago
It will be easier to track down with more information: For a few failing chemicals, what are the inchikeys (as above), but also, what is the chemical name? It would be especially good if some of these were simple things like water or fructose or glutamate, which I expect should appear in the data.
Oh, I think the problem is that the identifier above is not an inchikey. It's an inchi. An inchikey is something more like: AXVVJQZAKNNTPN-UHFFFAOYSA-N
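A quick way to tell the two apart is by shape. The sketch below is only an illustration, not the parser's actual code: `classify_identifier` is a hypothetical helper, and it assumes the standard InChIKey layout of 14 uppercase letters, 10 uppercase letters, and 1 uppercase letter separated by hyphens. It also treats a string starting with `1S/` or `1/` as an InChI whose `InChI=` prefix was dropped, which is a guess about how the failing value was produced.

```python
import re

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")

# An InChI normally starts with "InChI=1S/" (standard) or "InChI=1/".
# Assumption: the "InChI=" part may have been stripped upstream.
INCHI_RE = re.compile(r"(InChI=)?1S?/")


def classify_identifier(value: str) -> str:
    """Hypothetical helper: return 'inchikey', 'inchi', or 'unknown'."""
    value = value.strip()
    if INCHIKEY_RE.fullmatch(value):
        return "inchikey"
    if INCHI_RE.match(value):
        return "inchi"
    return "unknown"


# The InChIKey quoted in this comment.
print(classify_identifier("AXVVJQZAKNNTPN-UHFFFAOYSA-N"))

# The value from the failure report, without its INCHIKEY: label.
print(classify_identifier(
    "1S/C15H24/c1-11-7-8-13-12(2)6-5-9-15(3,4)14(13)10-11/h6,10,13-14H,5,7-9H2,1-4H3"
))
```

Running a check like this over the identifiers before normalization would show whether every value labelled INCHIKEY is really an InChI.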
I think FooDB needs to be looked at. It's using ChemicalSubstance and incorrect identifiers. We might need to work on Babel first to get the IDs for foods to work better.
The latest FooDB database has IUPAC, INCHIKEY, INCHI and SMILES equivalent identifiers for chemical compounds. There are no equivalent identifiers for nutrients.
Sadly, previous versions of the data had better equivalent identifiers.
The parser currently uses the INCHIKEY to normalize the nodes, and it has a 0% success rate. Look into using another equivalent identifier in the FooDB data that is more successful.
An example failure is: INCHIKEY:1S/C15H24/c1-11-7-8-13-12(2)6-5-9-15(3,4)14(13)10-11/h6,10,13-14H,5,7-9H2,1-4H3