Closed mattodd closed 8 years ago
Note, guys, that if strings are missing for OSM-S-80 through 90 you might find them here: http://malaria.ourexperiment.org/osm_procedures/5402/OSM_Compound_List.html (referring to #318)
Ok! It will be done by the end of today.
I can't find the OSM-A series on the link you've provided. Would you be able to link me to the page with the A-series of compounds?
I could only find the InChI for OSM-S-80 to 91 from the link you gave me.
Sorry @jhon4903 the structures are here. Are you able to derive the strings from this, or is that not possible?
And @minkyungchong can you derive the other strings from the InChI, or is that not possible/easy? i.e. is there a way to convert the InChI into a structure from which you can get the SMILES and InChIKey? No problem at all if not, just wondering.
@minkyungchong @mattodd What chemical drawing package are you using? There is usually a "Paste As" option and you just choose InChI. Alternatively you can use the Chemical Identifier Resolver to convert: http://cactus.nci.nih.gov/chemical/structure
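For anyone who would rather script the resolver than paste strings one at a time, here is a minimal sketch. It assumes the resolver accepts URLs of the form base/identifier/representation and that `smiles` and `stdinchikey` are valid representation names; that is my understanding of the service, so check its documentation first. The function names `resolver_url` and `resolve` are made up for this example.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL as given in this thread.
CACTUS_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def resolver_url(identifier: str, representation: str) -> str:
    """Build a resolver URL (assumed scheme: base/identifier/representation)."""
    # quote() percent-encodes characters such as '=' but leaves '/' alone,
    # so an InChI keeps its layer separators.
    return f"{CACTUS_BASE}/{quote(identifier)}/{representation}"


def resolve(identifier: str, representation: str, timeout: float = 30.0) -> str:
    """Fetch one representation as text. Needs network access."""
    with urlopen(resolver_url(identifier, representation), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


# Example call (not run here because it needs the network):
#   resolve("InChI=1S/CH4/h1H4", "smiles")
#   resolve("InChI=1S/CH4/h1H4", "stdinchikey")
```

The InChI in the example is methane, chosen only because it is short; substitute the InChI from the sheet.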
Ok I'll try that!
I'm not able to open the .cdxml file, as my program only imports .cdx files.
@mattodd @drc007 Sorry for the late submission, but everything has been added! The website was very helpful. Just checking: is CCN=C([C@@H]1CC@HN)O a valid SMILES?
CCN=C([C@@H https://github.com/H]1CC@HN)O is not a valid SMILES string
Willem P van Hoorn, PhD Head of Chemoinformatics ex scientia ltd
email: wvanhoorn@exscientia.co.uk web: http://www.exscientia.co.uk phone: +44 1382 346655
Yes, there are some formatting things in the above versions, but in my text version of these CCN=C([C@@H]1CC@HN)O I get an error in Chemdraw when I try to paste it. Which molecule are you trying to do?
Thanks @jhon4903 - don't worry about it, it's a little fiddly so I just took care of those 4 compounds. Thank you for doing all the others, that's great.
@mattodd OSM-S-89 was the one
Did someone fix this? The entry for OSM-S-89 is now CCN=C([C@@H]1CC@HN)O which looks OK to me, even though stereochemistry has been assumed when I paste into Chemdraw.
If everyone is happy I might close this issue and generate new ones for anything remaining (we will have to add in some biological data for all these compounds). Thanks again @minkyungchong and @jhon4903 as well as @drc007 and @wvanhoorn for advice.
Unfortunately 'CCN=C([C@@H https://github.com/H]1CC@HN)O' is not a valid SMILES string, both MarvinSketch and Pipeline Pilot can't render it into a structure. I don't have access to Chemdraw but I would assume it makes some on the fly fix when the SMILES are not quite right. If Chemdraw has interpreted the SMILES correctly, i.e. the structure is what it is supposed to be, is there an export or 'save as' option to get the SMILES out again? Hopefully these would be the interpreted (corrected) SMILES. Alternatively, if someone could provide a link where the correct structure is shown I can generate the SMILES.
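As a rough illustration of why a parser would reject the string as it appears in this thread, here is a small pre-check. It is not a SMILES parser and says nothing about chemical validity; it only flags a few obvious syntax problems (whitespace, unbalanced brackets or parentheses, `@` outside a bracket atom, unpaired ring-closure digits). The function name `smiles_precheck` is made up for this sketch, and a proper toolkit should be used for real validation.

```python
def smiles_precheck(smiles: str) -> list[str]:
    """Return obvious syntax problems; an empty list means none were found."""
    problems = []
    if any(ch.isspace() for ch in smiles):
        problems.append("contains whitespace")

    depth = 0            # current parenthesis nesting
    in_bracket = False   # True while inside a [...] bracket atom
    open_rings = set()   # ring-closure digits seen an odd number of times

    for ch in smiles:
        if ch == "[":
            if in_bracket:
                problems.append("'[' inside a bracket atom")
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                problems.append("']' without a matching '['")
            in_bracket = False
        elif in_bracket:
            continue  # contents of a bracket atom are not checked here
        elif ch == "@":
            problems.append("'@' outside a bracket atom")
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                problems.append("')' without a matching '('")
            else:
                depth -= 1
        elif ch.isdigit():
            # A digit opens a ring bond the first time and closes it the second.
            open_rings.symmetric_difference_update({ch})

    if in_bracket:
        problems.append("'[' never closed")
    if depth:
        problems.append("'(' never closed")
    if open_rings:
        problems.append("ring-closure digit(s) never closed: " + ", ".join(sorted(open_rings)))
    return problems


pasted = "CCN=C([C@@H]1CC@HN)O"   # the string exactly as it appears in this thread
for problem in smiles_precheck(pasted):
    print(problem)
```

On the string as pasted here, the check reports a bare `@` and an unclosed ring bond, which fits a string that lost characters somewhere between the sheet and the comment box.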
I think we may be suffering from GitHub's interpretation of what we're pasting here, which is not coming out right. If you go to the sheet (http://tinyurl.com/OSM-Compounds) and take the SMILES from there for OSM-S-89, it works fine (for me).
Aha, that makes a difference: 'CCN=C([C@@H]1CC@HN)O' is a valid string
I added a page on the malaria website:
http://www.cheminfo.org/flavor/malaria/Utilities/SMILES_generator___checker.html
It lets you generate a SMILES code, and also parse a list of SMILES and generate the structures.
I will add InChI generation as well.
We need to complete the dataset in the Master Sheet for OSM Series 1 compounds before we submit the relevant paper in the near future. The dataset will allow readers of the paper to browse the series interactively using @lpatiny's platform.
The compounds in Series 1 highlighted in red in the Master Sheet are those that are mentioned in the paper but which are not yet properly in the Sheet. The first thing is that we need the strings added in for those compounds. There are 56 of them. Could we split this up like this:
@minkyungchong: Any OSM-S compounds up to #200
@jhon4903: Any OSM-S numbers with # greater than 200, as well as the OSM-E, OSM-A and OSM-L compounds
Is that OK?
Would you be able to add in the SMILES, InChI and InChIKeys for these compounds, plus any MMV codes you happen to find? The relevant compounds can all be found by searching in the Experimental Procedures ELN - use the menu on the right. Lots of the strings are already there too for copying and pasting.
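To keep track of which compounds still need strings, something like the sketch below could be run on a CSV export of the sheet. The column headers (`ID`, `SMILES`, `InChI`, `InChIKey`) and the `EXAMPLE-` rows are placeholders invented for this example, not the real Master Sheet layout, so adjust them to match the actual export.

```python
import csv
import io

# Hypothetical column headers; rename to match the real sheet export.
REQUIRED = ("SMILES", "InChI", "InChIKey")


def missing_strings(rows):
    """Map each compound ID to the required columns that are still empty."""
    gaps = {}
    for row in rows:
        empty = [col for col in REQUIRED if not (row.get(col) or "").strip()]
        if empty:
            gaps[row["ID"]] = empty
    return gaps


# Placeholder data (methane), not taken from the Master Sheet.
SAMPLE_CSV = (
    "ID,SMILES,InChI,InChIKey\n"
    "EXAMPLE-1,C,InChI=1S/CH4/h1H4,VNWKTOKETHGBQD-UHFFFAOYSA-N\n"
    "EXAMPLE-2,C,,\n"
)

gaps = missing_strings(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for compound_id, columns in gaps.items():
    print(compound_id, "is missing:", ", ".join(columns))
```

For a real export, replace the `io.StringIO(SAMPLE_CSV)` argument with an opened file handle.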
(Series 1 compounds highlighted in the master sheet in green are mentioned in the paper and there are already data in the sheet so it looks like those are OK. Compounds that are not colour-highlighted are not explicitly mentioned in the paper for whatever reason and are a lower priority.)
Once all the informatics data are added, we can add in the biological data (potency and everything else), which should be fairly quick given that all the data are included in the draft paper, and we can work off that. But let’s manage that separately.
Let me know below if there are any problems or if you’ve any questions. Or (of course) if you’ve no current bandwidth and I ought to find someone else! Otherwise, thank you guys.