Closed cdsouthan closed 8 years ago
There are Keys on the page, but not yet for every compound. Some ought to be captured..?
On 28 Mar 2015 09:48, "cdsouthan" notifications@github.com wrote:
I note that while the page below converts well in chemicalize.org, there are no InChIKeys
http://openwetware.org/wiki/OpenSourceMalaria:Triazolopyrazine_%28TP%29_Series#Strings_for_Google
As you know, the Keys are the most effective way for your structures to become Google findable, so it's a good idea to add them. Otherwise you have the odd situation that leads like MMV670437 PMIWBIXSAYKRGF-SFHVURJKSA-N can be found, but not on the OSM site
— Reply to this email directly or view it on GitHub https://github.com/OpenSourceMalaria/OSM_To_Do_List/issues/289.
If you look at the source code, it is an HTML page, which is great for viewing in a web browser, but I wonder if the HTML tags are messing up interpretation?
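One way to test this is to strip the markup and look for key-shaped strings in what remains. Below is a minimal Python sketch, assuming the standard 27-character InChIKey layout (14 letters, hyphen, 10 letters, hyphen, 1 letter). The names `TextExtractor` and `find_inchikeys` and the sample HTML fragment are hypothetical, not taken from the OSM wiki.

```python
import re
from html.parser import HTMLParser

# Standard InChIKey layout: 14, 10 and 1 uppercase letters, hyphen-separated.
INCHIKEY_RE = re.compile(r"\b[A-Z]{14}-[A-Z]{10}-[A-Z]\b")


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def find_inchikeys(html):
    """Return InChIKey-shaped strings found in the text nodes of `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Text nodes are joined with a space, so a key split by a tag will not match.
    text = " ".join(parser.chunks)
    return INCHIKEY_RE.findall(text)


# Hypothetical fragment; the key is the one quoted in this thread.
sample = "<p>MMV670437 <b>PMIWBIXSAYKRGF-SFHVURJKSA-N</b></p>"
print(find_inchikeys(sample))
```

If a key sits whole inside one element it is found. A key that is broken up by inline tags would be missed by this sketch, which is one way markup could interfere.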
Chris/Matt, there are two related issues here. As you can test, chemicalize.org works fine for IUPAC, SMILES or InChI strings on (any) HTML pages. But AFAIK it won't "convert" InChIKeys (it could do a look-up for ones it has, but that's not configured). As you can see (image attached), Google indexes the InChIKey just fine in any OSM instantiation. This solves the "findability" problem, but only as exact (or inner layer) matches. Following my encouragement, ChemAxon actually deposited their chemicalize.org structure conversion cache of 0.3 million structures in PubChem in 2012 https://www.ncbi.nlm.nih.gov/pccompound?term=%22chemicalize.org%20by%20ChemAxon%22[SourceName]&cmd=DetailsSearch The big (potential) advantage for OSM would be a quicker route for getting all structures into PubChem (simply via chemicalization of the web pages), which then become globally findable (in PubChem) by similarity as well as by exact match. However, ChemAxon did not prioritise the updates, so you can only "find" newer ones (i.e. from 2013 onwards) if you execute a similarity search on chemicalize.org in situ (i.e. against their updated local cache of structures).
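To make the exact versus inner-layer distinction concrete: a standard InChIKey has three hyphen-separated blocks, and the first 14 letters hash the connectivity (the skeleton). Searching on that first block alone should therefore also turn up stereoisomers and other variants that share the skeleton. The following is a minimal sketch under that assumption. The function names and the returned dictionary keys are hypothetical, and the quoted search strings are only an illustration of what one might paste into a search engine.

```python
def split_inchikey(key):
    """Split a standard-layout InChIKey into its blocks, or raise ValueError."""
    parts = key.split("-")
    if [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a 14-10-1 InChIKey: {key!r}")
    if not all(p.isalpha() and p.isupper() for p in parts):
        raise ValueError(f"InChIKey blocks must be uppercase letters: {key!r}")
    connectivity, second, protonation = parts
    return {
        "connectivity": connectivity,      # hash of the skeleton
        "other_layers": second[:8],        # hash of stereo, isotopes, etc.
        "is_standard": second[8] == "S",   # 'S' = standard, 'N' = non-standard
        "version": second[9],              # 'A' for InChI version 1
        "protonation": protonation,        # 'N' = neutral
    }


def search_terms(key):
    """Build an exact-match term and a skeleton-only term for the same key."""
    blocks = split_inchikey(key)
    return {"exact": f'"{key}"', "skeleton": f'"{blocks["connectivity"]}"'}


# The key quoted in this thread for MMV670437.
print(search_terms("PMIWBIXSAYKRGF-SFHVURJKSA-N"))
```

The exact term only finds pages carrying the full key, while the skeleton term is looser. Neither is a similarity search in the PubChem sense.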
Just wanting to either progress or close this. We continue to use Keys, correct, but is there some problem with including them as text on a wiki? I.e. is this a problem at our end or specifically with chemicalize.org? Is this issue critical (actively introducing confusion) or rather a "would like to have"?
Since data are now being added to the Google Master sheet, I'm closing this, and we can re-engage with the newer ways in which we are making the strings public.