StructuralGenomicsConsortium / CNP4-Nsp13-C-terminus-B

An SGC Open Chemical Networks Project Devoted to a site on the SARS-CoV-2 protein nsp13

Top 100 Purchaseable Predictions #3

Closed mattodd closed 1 year ago

mattodd commented 2 years ago

Konstantin Popov (UNC Chapel Hill, @kipUNC) has carried out the first predictions of purchasable compounds that should bind the C-terminus-B site. The PDF is here.

The molecule codes need tweaking (so they can be seen in full), then the list needs triaging and prioritising based on developability/accessibility, and maybe cost.

Which compounds should we buy and send to Masoud at Toronto?

mattodd commented 2 years ago

@kipUNC shared a PyMOL file, which is here. As he mentioned: "Please find the PyMOL session we were talking about. This session shows a sample of some top compounds with their Glide poses docked into site 3. Purple ligands are from the Enamine Hit Locator Library, orange are your designed compounds and green are 3 Fragalysis X-ray structures for reference." Maybe take a look @lindapatio @bhzhx

TomkUCL commented 2 years ago

Not sure if this is useful, but this paper may help with comparing fragment diversity across commercial libraries?

Comprehensive analysis of commercial fragment libraries RSC Med. Chem., 2022, Advance Article https://pubs.rsc.org/en/content/articlelanding/2022/md/d1md00363a/unauth

TomkUCL commented 2 years ago

@mattodd @tmw20653 @H-agha

Here is the Enamine quote for the top 100 scoring compounds from the Enamine 40BN library from @kipUNC (cluster_centroids_100.dwar):

Copy of Quote_1517184_EUR 5258.xlsx

mattodd commented 2 years ago

Hi @TomkUCL - so there's a cluster of available molecules at the top of the sheet. Most are unavailable (is that surprising?). These can be bought:

Cl.O=C(NC=1C=CC=CC1)C2CCN(CC2)C=3C=NC=CN3
Cl.O=C(CCNC(=O)C=1C=CC=NC1)NC=2C=CC=CC2
Cl.O=C(N1CCN(CC1)C=2C=CC=NC2)C3=CNC=4C=CC=CC34
Cl.O=C(NC=1C=CC=CC1)C2CCN(CC2)C=3C=CC=NC3
O=C(CNC(=O)CC=1C=CC=CC1)NC=2C=CC=C3N=CNC23
O=C(NC=1C=CC=2OCC(=O)NC2N1)C(=O)C3=CNC=4C=CC=CC34
Cl.O=C(CC=1C=CC=CC1)NNC(=O)C=2C=CC=NC2
O=C(NC=1C=CC=2CCC(=O)NC2C1)C(=O)C3=CNC=4C=CC=CC34
Cl.O=C(NC=1C=CC=CC1)C(=O)NC=2C=CC=NC2
O=C(NC=1C=CC=C(CN2C=NC=N2)C1)C(=O)C3=CNC=4C=CC=CC34
O=C(CC=1C=CC=CC1)NC=2C=CC=3CCCC(=O)NC3C2
O=C(C=CC=1C=CC=2OCC(=O)NC2C1)C3=CNC=4C=CC=CC34
O=C(NCC=1C=CC=C2OCC(=O)NC12)C(=O)C3=CNC=4C=CC=CC34
Cl.O=C(NCC=1C=CC=C2CN(CCC12)C(=O)[C@@H]3CCC=4C=CC=CC34)C=5C=CC=NC5

So, @kipUNC, I guess we need to know whether these compounds remain of interest? Have they been superseded by any more recent predictions?
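
Several of the purchasable SMILES above are quoted as hydrochloride salts (the leading `Cl.` component). When comparing vendor entries against docking output, it may help to separate the parent molecule from the counter-ion first. The following is only a minimal sketch using plain string handling: `strip_counterions` is a hypothetical helper name, and "longest dot-separated component" is a rough heuristic, not a validated salt-stripping procedure. A cheminformatics toolkit would be the more robust choice for real triage.

```python
def strip_counterions(smiles: str) -> str:
    """Return the longest dot-separated component of a SMILES string.

    Assumption: the parent molecule is the longest component, which holds
    for simple salts such as 'Cl.<parent>' but is only a heuristic.
    """
    return max(smiles.split("."), key=len)


# Three entries copied from the list above (two HCl salts, one free base).
raw = [
    "Cl.O=C(NC=1C=CC=CC1)C2CCN(CC2)C=3C=NC=CN3",
    "O=C(CNC(=O)CC=1C=CC=CC1)NC=2C=CC=C3N=CNC23",
    "Cl.O=C(NC=1C=CC=CC1)C(=O)NC=2C=CC=NC2",
]

parents = [strip_counterions(s) for s in raw]  # parent molecules only
salts = [s for s in raw if "." in s]           # entries supplied as salts

for original, parent in zip(raw, parents):
    print(f"{original} -> {parent}")
```

Stripping by string length keeps the example dependency-free, at the cost of not canonicalising the SMILES, so two different spellings of the same molecule would still look distinct.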

kipUNC commented 2 years ago

As the previous list had SMILES for the corresponding scaffolds, not the actual compounds, please find an updated list. This may explain the low % synthetic availability for the SMILES from the previous list. nsp13_40b_docking_top_500_not_scaffolds.txt

TomkUCL commented 2 years ago

Thanks Kyosta, would you like me to submit this list to Enamine in place of the previous list?

kipUNC commented 2 years ago

Yes please.

kipUNC commented 2 years ago

Here are the top 28 generated molecules, reinforced by the top hit from the 40B screening. generated_top_28_fineTune40B.csv

TomkUCL commented 2 years ago

@kipUNC following today's meeting, and @tmw20653 checking to clarify where the fragments have come from, I just want to confirm the number of purchasable compounds with you. We will continue synthesising any non-purchasable ones here. Thanks.

Pharmacophore screen generated original top 40 scoring compounds (9 are purchasable from Enamine)

Generative Enamine REAL – top 100 (43 are purchasable from Enamine)

= 52 total are purchasable so far

Kyosta is currently running a reinforced biased screen to bump up the number of purchasable compounds.

Please correct if any mistakes here.

mattodd commented 2 years ago

Looks good @TomkUCL. Hopefully all correct @kipUNC. I'm updating the wider AViDD READDI group today and I've started to construct the "Story So Far" page on the wiki which is meant to be the high level summary of where we're up to.

To avoid any confusion in what we're talking about, can we call this list of compounds the RealDock1 list? We can increase the number each time we use the same methodology. So actually, the initial list was RealDock1 and the new one that @kipUNC is working on is RealDock2? The other lists are Generative1, Generative2 etc?

I think this Issue (number 3) should be reserved for the compounds that have arisen from docking the Enamine REAL library - i.e. the docking of commercially available molecules, not generative approaches. I guess we can close this Issue when we have bought and shipped the top hits.
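
The naming scheme proposed above (RealDock1, RealDock2, Generative1, Generative2, and so on) amounts to a method prefix plus a counter that increases each time the same methodology is reused. A minimal sketch of that convention follows; `next_list_name` is a hypothetical helper for illustration and is not part of any project tooling.

```python
import re


def next_list_name(method: str, existing: list[str]) -> str:
    """Return the next numbered list name for a given method prefix.

    'existing' holds the names already in use, e.g. ["RealDock1"].
    """
    pattern = re.compile(rf"^{re.escape(method)}(\d+)$")
    numbers = [
        int(match.group(1))
        for name in existing
        if (match := pattern.match(name))
    ]
    # Start at 1 when the method has not been used before.
    return f"{method}{max(numbers, default=0) + 1}"


# Illustrative state only: one docking list and two generative lists so far.
existing = ["RealDock1", "Generative1", "Generative2"]
print(next_list_name("RealDock", existing))
print(next_list_name("Generative", existing))
```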

kipUNC commented 2 years ago

Those numbers look reasonable. We've had some pretty good results with our last screening of the 40B REAL library and I am now finishing the list, so that will be the last list for Site 3. I would not necessarily keep all the purchasable hits in separate lists, as the separation was mostly due to the developmental stage of the project. So when all the purchasable hits are ready we can combine them. As for the generative compound lists, I would keep two, as we have two conceptually different approaches: first, simpler compounds that use first-stage pharmacophore model biasing, and second, compounds that use information about the latest virtual hits from the 40B REAL space.

mattodd commented 1 year ago

Closing this Issue for now, since this page is linked to the Story So Far page. It also describes the generation of lists of molecules that I believe were superseded by a more refined list.

We need the final list (the one that was ordered) clearly linked to the Story So Far page. @TomkUCL

We also need technical details of how the 40Bn Enamine docking was done, plus the method of refinement. @kipUNC