Open TomkUCL opened 7 months ago
The process so far:
We are interested in retaining these cores whilst moving away from the N-oxide parent-compounds since the 51 purchased Enamine N-oxide set afforded no hits by ATPase/SPR assay at UNC. Therefore, we are diversifying the functional groups on either end of the diamine cores for docking.
Following our recent meeting concerning virtual libraries for the UCL cores, Peter Brown (@toluene44) went back and picked 316 diverse acids available from Enamine Building Blocks and created 10 libraries of about 93,000 compounds each. The link to download the folder is below. https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.dropbox.com%2Fscl%2Ffo%2Flrf09mdh3qoe5k2006okp%2Fh%3Frlkey%3Dvwp7vounwat5h2l5zqmczdjf9%26dl%3D0&data=05%7C01%7Cthomas.knight.21%40ucl.ac.uk%7C389c7b54639143f4427008dbe07af024%7C1faf88fea9984c5b93c9210a11d9a5c2%7C0%7C0%7C638350590860482586%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=EUovSK4kxi86vPz5jjourKjKNnZA8I94KN8Z8LWdw0E%3D&reserved=0
The 10 rigid (non-linear alkyl) cores arose from the top 100 GLIDE scoring compounds from the de novo generated set, shown here:
I've filtered these compounds in DataWarrior using Lipinski filters and by removing 'nasty or toxic' functional groups to favour promising drug-like molecules as starting points. This process reduced the number of molecules from >93'000 per core to ~20'000 per core.
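As a rough illustration of the Lipinski step only, here is a minimal Python sketch. It assumes the classic rule-of-five thresholds and zero allowed violations; the exact DataWarrior settings are not stated in this thread. The `Compound` class, the compound names and the property values are hypothetical, and the 'nasty or toxic' functional-group filter is not shown.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    # Hypothetical record; in practice these values would be exported from DataWarrior.
    name: str
    mol_weight: float   # Da
    clogp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count violations of the classic rule of five (assumed thresholds)."""
    return sum([
        c.mol_weight > 500,
        c.clogp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def passes_lipinski(c: Compound, max_violations: int = 0) -> bool:
    # max_violations=0 is an assumption; some workflows tolerate one violation.
    return lipinski_violations(c) <= max_violations


# Made-up example values, for illustration only.
library = [
    Compound("cpd_A", 342.4, 2.1, 2, 5),
    Compound("cpd_B", 612.7, 5.8, 3, 9),
    Compound("cpd_C", 488.5, 4.9, 5, 10),
]
kept = [c.name for c in library if passes_lipinski(c)]
print(kept)
```

Here `cpd_B` is dropped for exceeding both the molecular weight and cLogP limits, while the other two sit at or below every threshold.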
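Next, I chose 1000 compounds based on 'structural diversity', ranked so that each successive compound is as dissimilar as possible from those already chosen (compound 2 is the most different from 1, compound 3 the most different from 1 & 2, etc.).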
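One common way to build a ranked 'most diverse' list like this is a greedy MaxMin selection on fingerprint similarity. The sketch below is a generic pure-Python version under that assumption; it is not necessarily the algorithm or descriptor DataWarrior uses. The fingerprints are toy sets of on-bits, and `maxmin_pick` and `tanimoto` are illustrative names.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def maxmin_pick(fingerprints, n_pick, seed_index=0):
    """Greedy MaxMin: repeatedly add the compound whose most similar
    already-picked neighbour is the least similar."""
    n_pick = min(n_pick, len(fingerprints))
    picked = [seed_index]
    # For each compound, the highest similarity to anything picked so far.
    max_sim = [tanimoto(fp, fingerprints[seed_index]) for fp in fingerprints]
    while len(picked) < n_pick:
        candidates = [i for i in range(len(fingerprints)) if i not in picked]
        best = min(candidates, key=lambda i: max_sim[i])
        picked.append(best)
        for i, fp in enumerate(fingerprints):
            max_sim[i] = max(max_sim[i], tanimoto(fp, fingerprints[best]))
    return picked


# Toy fingerprints (hypothetical): index 1 is a close analogue of index 0.
toy = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {10, 11, 12},
    {1, 2, 10, 11},
]
order = maxmin_pick(toy, n_pick=3)
print(order)
```

The close analogue (index 1) is picked last, if at all, which is the behaviour wanted when trimming ~20'000 compounds per core down to 1000.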
These 1000 'most diverse' compounds were then subjected to a virtual screen using the open-source virtual screening tool PyRx 0.8 with the AutoDock Vina 1.2 scoring function. References: Autodock Vina (https://vina.scripps.edu/, https://doi.org/10.1021/acs.jcim.1c00203, https://doi.org/10.1002/jcc.21334, https://doi.org/10.1021/acs.jcim.8b00545). PyRx 0.8 (https://pyrx.sourceforge.io/ , Small-Molecule Library Screening by Docking with PyRx. Dallakyan S, Olson AJ. Methods Mol Biol. 2015;1263:243-50., https://www.nature.com/articles/s41598-021-83626-x , https://www.nature.com/articles/s41598-020-60221-0 , https://doi.org/10.1016/B978-0-12-822312-3.00019-9)
The helicase protein was prepared using Dock Prep in UCSF Chimera 1.6 and the ATP and ssRNA were removed from the structure to make these sites available for binding.
Exhaustiveness was run at 8 conformers per compound, then the compounds were ranked by their average binding energy. The poses were then visually inspected and validated using Biovia Discovery Studio 2021.
I ran this process for Core 6 (@qxsml / Andy's core) since the route to the mono-Boc protected diamine core seems to be relatively simple. Unfortunately the screen crashed after compound 726/1000, so I have only included these results.
The DataWarrior file can be found here: https://drive.google.com/file/d/11W11d40R7SAbVLwOzM_I_ZLTwUvdJ_fz/view?usp=drive_link
The Excel file for each binding pose and the associated binding energy can be found here: https://docs.google.com/spreadsheets/d/1ki3RNRQ3Q8lTqnXZJl03GibxwsCvG8y2/edit?usp=drive_link&ouid=117232601769274897551&rtpof=true&sd=true
https://drive.google.com/file/d/1ZEU3J7N_IIH_z02WHENOavddIVRqODHA/view?usp=drive_link
Summary of results for PyRx/Vina virtual screen for Core 1 Enamine carboxylic acids:
I have redocked the same Core1 library but this time to the close/engaged form of the protein as part of the replicase-complex (pdb 7kro).
This is Geoff Well's extended protein model used for MD simulations (i.e. not truncated at the C-terminus). Geoff has agreed to run a few hundred nanosecond simulations of the top-scoring molecules to prioritise them for purchasing.
Raw results and average binding affinities: Core1 PyRx 0.8 Vina virtual screen results.xlsx Core1 PyRx 0.8 Vina virtual screen results - average binding energies .xlsx
I have taken the average binding energy for each pose. I set the cut-off at -10 kcal/mol and then visually inspected the poses to validate them where green = valid pose, red = invalid pose (cis-amide geometry).
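The averaging and cut-off step can be sketched in a few lines of Python. This assumes the per-pose results are available as (compound, binding energy in kcal/mol) pairs and that the -10 kcal/mol cut-off is inclusive; the compound names and energies below are made up, and the visual pose check (green/red) remains a manual step.

```python
from collections import defaultdict
from statistics import mean


def rank_by_mean_energy(pose_rows, cutoff=-10.0):
    """Average the pose energies per compound, keep those at or below the
    cut-off, and return them most negative (best) first."""
    by_compound = defaultdict(list)
    for name, energy in pose_rows:
        by_compound[name].append(energy)
    averages = {name: mean(energies) for name, energies in by_compound.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1])
    return [(name, avg) for name, avg in ranked if avg <= cutoff]


# Hypothetical per-pose energies (kcal/mol), for illustration only.
rows = [
    ("cpd_A", -10.8), ("cpd_A", -10.4), ("cpd_A", -10.0),
    ("cpd_B", -9.6), ("cpd_B", -9.0),
    ("cpd_C", -11.5), ("cpd_C", -11.1),
]
shortlist = rank_by_mean_energy(rows)
for name, avg in shortlist:
    print(f"{name}: {avg:.2f} kcal/mol")
```

Only compounds whose mean energy meets the cut-off reach the shortlist, so `cpd_B` is excluded before any poses are inspected.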
Here is my completed analysis of the Core1 Enamine carboxylic acids enumerated library, including the docking of compounds, plans for synthesis and starting materials needed:
Core 1 enumerated library virtual screen analysis slides
In summary, I have prioritised 15 compounds for reagent purchasing based on virtual screen results (PyRx/Vina) against two extreme nsp13 conformers, 7kro and 7rdx.
These compounds have gone to Geoff for MD analysis. All thoughts are welcome.
Why wait for MD results—just make those 15!
Peter
From: Tom Knight @.> Date: Wednesday, December 13, 2023 at 8:41 AM To: StructuralGenomicsConsortium/CNP4-Nsp13-C-terminus-B @.> Cc: Brown, Peter J @.>, Mention @.> Subject: Re: [StructuralGenomicsConsortium/CNP4-Nsp13-C-terminus-B] PyRx 0.8 / AutoDock Vina Virtual Screen of de novo generated core compounds (non-N-oxides) (Issue #43)
Here is my complete analysis of the Core1 Enamine carboxylic acids enumerated library, including the docking of compounds, plans for synthesis and starting materials needed:
Presentation 2.pdf: https://github.com/StructuralGenomicsConsortium/CNP4-Nsp13-C-terminus-B/files/13661277/Presentation.2.pdf
These compounds have gone to Geoff for MD analysis. All thoughts are welcome.
@toluene44 I will order the starting materials today. I will see where these compounds score in @kipUNC's Glide screen and hopefully there will be some overlap.
Could you provide a brief description below of how you created your enumerated libraries for open science-ness purposes, please?
Here I have described the current strategy regarding the retained diamine cores from the de-novo generated compounds since deprioritising the N-oxide series from which these cores originated.
Peter Brown enumerated these diamine cores using purchasable carboxylic acids from Enamine, giving ~93'000 compounds per core; @kipUNC has received these libraries for virtual screening using Maestro GLIDE.
Simultaneously, I have been filtering these compounds based on their predicted physicochemical properties using DataWarrior https://doi.org/10.1021/ci500588j and then running a smaller-scale virtual screen (~1000 compounds/day) using the open-source software PyRx 0.8 https://sourceforge.net/projects/pyrx/files/0.8/ , which uses the AutoDock Vina scoring function to estimate binding free energies, to prioritise compounds for purchasing / synthesis. This process is described in more depth below.
A key question when prioritising compounds for the helicase using in-silico methods is: which conformation of nsp13 should be used for docking?
The 2022 paper from D Shaw's team, "Ensemble cryo-EM reveals conformational states of the nsp13 helicase in the SARS-CoV-2 helicase replication–transcription complex" https://www.nature.com/articles/s41594-022-00734-6 , describes four conformational states of the nsp13–RTC:
These states are thought to regulate the RNA synthesis and proofreading of the SARS-CoV-2 virus.