StructuralGenomicsConsortium / CNP4-Nsp13-C-terminus-B

An SGC Open Chemical Networks Project Devoted to a site on the SARS-CoV-2 protein nsp13

Identifying Under-represented or NON represented building blocks #8

Open mattodd opened 2 years ago

mattodd commented 2 years ago

Further to the design conversations over at #4 this might be an interesting source of some potential targets? https://pubs.acs.org/doi/10.1021/acs.jmedchem.1c01139 @edwintse @TomkUCL @jemimahaque - see if it contains info about those simple building blocks that might be not yet known?

Or are there other studies highlighting complex low molecular weight compounds that are "missing" from commercial databases?

TomkUCL commented 2 years ago

sorry not quite sure what you mean, i.e. look for unknown derivatives of these building blocks that we might be able to make?

mattodd commented 2 years ago

I was thinking whether the article discussed "Fantastic Blocks and Where to Find Them" i.e. whether there were building blocks that would be advantageous but which ... do not yet exist (i.e. the kind of problem we're currently mulling over). Which potentially useful building blocks are conspicuous by their absence?

mattodd commented 2 years ago

OK @TomkUCL @edwintse so based on the idea we were discussing in lab:

New Tom Ed Idea

Steps are:

1) Are the resulting blocks (rightmost compound) not commercially available? Search via a simplest-search using a fragment structure like that shown. If simply available, kill it.

2) Is there any Scifinder precedent for the first step, i.e. the selectivity challenge.

3) Search for all isomeric variants in the central ring, i.e. 1 and 2 N's in different positions.

4) If looking good, let's get a starting material.

5) Let's ask UNC/Toronto folks to enumerate a library to see if anything that is accessible using this chemistry would be useful for nsp13.
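For step 3, the ring isomers can be listed up front before any searching. The sketch below is only an illustration and rests on an assumption not stated in the thread: that the central ring is six-membered with the two different substituents para to each other (as in the 2,5-disubstituted pyrimidine and pyrazine examples that come up in this thread). The position numbering and the names `FREE_POSITIONS`, `MIRROR` and `ring_isomers` are made up for this example.

```python
from itertools import combinations

# Assumption: six-membered aromatic ring, two different substituents at
# positions 1 and 4. Positions 2, 3, 5 and 6 can each be CH or N.
FREE_POSITIONS = (2, 3, 5, 6)

# Reflection through the 1-4 axis swaps position 2 with 6 and 3 with 5.
MIRROR = {2: 6, 3: 5, 5: 3, 6: 2}


def canonical(n_positions):
    """Pick one representative for an N placement and its mirror image."""
    original = tuple(sorted(n_positions))
    mirrored = tuple(sorted(MIRROR[p] for p in n_positions))
    return min(original, mirrored)


def ring_isomers(max_nitrogens=2):
    """Enumerate symmetry-unique placements of 1..max_nitrogens ring N atoms."""
    unique = set()
    for count in range(1, max_nitrogens + 1):
        for combo in combinations(FREE_POSITIONS, count):
            unique.add(canonical(combo))
    # Sort by number of nitrogens first, then by position.
    return sorted(unique, key=lambda placement: (len(placement), placement))


isomers = ring_isomers()
for placement in isomers:
    print(len(placement), "N at ring positions", placement)
```

Under that assumption this gives two mono-aza rings and four diaza rings (one pyridazine-type, one pyrazine-type and two pyrimidine-type placements), each of which would need its own availability search as in step 1.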

mattodd commented 2 years ago

Start with the carbonyl and see what we get. Come to think of it, we could reverse the order of chemical steps here. Acylate then SNAr?

On Tue, 30 Nov 2021, 12:11 Tom Knight, @.***> wrote:


For step 1, do you mean the amide or the free aniline (without the blue carbonyl)?


TomkUCL commented 2 years ago

That would probably help to prevent any self-reaction/polymerisation of the aniline (if it occurs).

@mattodd so by acylating first we're talking about having the carbonyl directly on the ring first? e.g. image

TomkUCL commented 2 years ago

5-Amino-2-pyrimidinecarboxylic acid is available from Fluorochem for £198 / 1 g, so we can rule that one out. image

mattodd commented 2 years ago

No - still two N's attached to the ring. For my bottom right structure I mean you could acylate the ring N first (e.g. with R'OCl), then substitute the halogen.

mattodd commented 2 years ago

Another relevant and interesting paper - @edwintse @TomkUCL @jemimahaque please check it out and see whether there is anything we can learn from or use.

https://pubs.acs.org/doi/10.1021/acsmedchemlett.1c00340

TomkUCL commented 2 years ago

Right, I understand what you mean.

TomkUCL commented 2 years ago

Here's the link to the MW-assisted SNAr reaction paper previously discussed. https://pubs.acs.org/doi/abs/10.1021/jo8008758

TomkUCL commented 2 years ago

So pyrimidin-2-amine derivatives look pretty limited, particularly those with aliphatic chains on either side of the ring:

image image image image image image image image image

TomkUCL commented 2 years ago

The 2,5-pyrazine building block looks quite limited in terms of commercial availability, which could be a good sign; image

Similarly, Enamine only make 4 substructures of the building block, which are quite specific. image

tert-Butyl (5-aminopyrazin-2-yl)carbamate is available from Fluorochem at £239 / 1 g: image

TomkUCL commented 2 years ago

3-Pyridazinyl building blocks aren't especially prevalent either, but Enamine have about 18 similar building blocks for this heterocycle. image image

TomkUCL commented 2 years ago

Maybe useful after today's discussion for reaching different functionalised heterocycles: Use of sustainable organic transformations in the construction of heterocyclic scaffolds https://doi.org/10.1016/B978-0-12-817592-7.00009-5

TomkUCL commented 2 years ago

Further to the design conversations over at #4 this might be an interesting source of some potential targets? https://pubs.acs.org/doi/10.1021/acs.jmedchem.1c01139 @edwintse @TomkUCL @jemimahaque - see if it contains info about those simple building blocks that might be not yet known?

Or are there other studies highlighting complex low molecular weight compounds that are "missing" from commercial databases?

Below are points from the paper that might be worth discussing further. It may be worth investigating which accessible boronic acids or esters are under- or non-represented. It is also worth mentioning that industry takes more interest in chemistry it already uses regularly (e.g., Suzuki cross-couplings):

"Suzuki coupling with alkyl groups generally requires very specific reaction conditions, different functional handles can have limited substrate scope. In our experience, there are still a lot of hurdles for alkyl Suzuki coupling to be used routinely by medicinal chemists, let alone in a library format, as detailed in our recent publication. (15) This situation is further hampered by the commercial availability of these building blocks. Ideally, if one could install both aryl and alkyl functionalities via the same Suzuki reaction condition within a parallel library, they will greatly improve the SAR efficiencies in medicinal chemistry research."

"About 25% of the boronic acids/esters used in our libraries originated from our corporate compound collection. These building blocks were either not known in the literature or lack reliable commercial sources. For example, many of the proprietary building blocks were designed as analogues of compound 11 (Figure 1)."

"...Another example is cyclopropyl boronic acid/esters (compound 12, Figure 1)... This is also reflected in our data set where only less than five such building blocks (compounds 12, Figure 1) were used in library synthesis over the last decade. As such, novel boronic esters incorporating a cyclopropyl group were designed and made available to medicinal chemists internally. (17) The typical Suzuki library conditions used for aryl boronic acids/esters were also applicable for cyclopropyl boronic esters. This is advantageous, as one can obtain both aryl and cyclopropyl containing final compounds within one Suzuki library, a desirable feature for SAR studies."

"The analysis of boronic acids/esters of underrepresented structures in medicinal chemistry research can be the first important step to fully garnish the power of Suzuki cross-coupling reactions, especially in parallel library format, to effectively sample the SAR space and potentially obtain IP advantages."

image

TomkUCL commented 2 years ago

Here's my take on this paper as we have previously discussed: Novel Reagent Space: Identifying Unorderable but Readily Synthesizable Building Blocks https://pubs.acs.org/action/showCitFormats?doi=10.1021/acsmedchemlett.1c00340&ref=pdf

Aim: To identify large numbers of drug discovery building blocks neither commercially available nor present in an internal inventory but that could be prepared with one chemical transformation from a readily available precursor. The authors hope to automate this process using a cheminformatics method with an awareness of all orderable reagents (from an internal inventory or commercially available) and knowledge of all other reagents that could be derived from them via one robust chemical transformation.

Step 1: Identify all orderable reagents, using both internal proprietary and also commercial building blocks; an easily overlooked but necessary component is the exclusion of non-orderable compounds. From the intermediates database, only those molecules reported to have at least 50 mg available were included. Commercial reagent structures were obtained from a selected set of trusted vendors and some specialty catalogues, requesting only available compounds, excluding compounds requiring synthesis upon ordering.

Step 2: Search through this curated building block set for a molecule conceptually similar to 4, identifying any structural isomers, but which aren't commercially available.

Step 3: Apply a mechanism to assess whether a desired building block can be readily synthesized. For this purpose, they used modules available within the ASKCOS v0.3.124 suite of retrosynthetic tools:

SCScore - SCScore is a single numerical estimation of a molecule’s synthetic complexity, not an assessment of any particular reaction or synthetic path. It has recently been used to analyse and improve the synthesizability of compounds proposed by generative models.

One-Step Retrosynthesis Fast Filter Score - provides a likelihood that conditions exist for which the reactants will form the desired product.

One-Step Retrosynthesis Score - an assessment of whether the specific forward reaction will proceed as expected.
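Step 1 above (restricting the pool to genuinely orderable reagents) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Reagent` record, its field names, the `"internal"`/`"vendor"` labels and the example entries are all hypothetical. Only the 50 mg threshold and the exclusion of make-on-demand compounds come from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reagent:
    name: str
    source: str               # hypothetical labels: "internal" or "vendor"
    amount_mg: float = 0.0    # stock on hand, used for internal intermediates
    in_stock: bool = False    # False for make-on-demand catalogue entries


MIN_INTERNAL_MG = 50.0  # "at least 50 mg available", per the summary


def is_orderable(reagent: Reagent) -> bool:
    """Keep internal items with enough material and vendor items in stock."""
    if reagent.source == "internal":
        return reagent.amount_mg >= MIN_INTERNAL_MG
    if reagent.source == "vendor":
        return reagent.in_stock
    return False  # unknown sources are treated as not orderable


# Made-up entries purely to exercise the filter.
catalogue = [
    Reagent("block-A", "internal", amount_mg=120.0),
    Reagent("block-B", "internal", amount_mg=10.0),
    Reagent("block-C", "vendor", in_stock=True),
    Reagent("block-D", "vendor", in_stock=False),
]

orderable = [r.name for r in catalogue if is_orderable(r)]
print(orderable)
```

Anything that fails this filter is a candidate for the "unorderable" pool that the later steps then score for one-step accessibility.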

The authors propose a preliminary rule-of-thumb:

..."Project design parameters required a neutral substituent at this position; using pKa predictions from Pipeline Pilot, we removed any structure with an acidic or basic moiety, leaving 223'163 candidate alcohols... we removed the 1437 alcohols in this set from GDB-13 that also corresponded to available compounds (see the Supporting Information). The remaining 221'726 alcohols were filtered by performing a One-Step Retrosynthesis with ASKCOS (30) and removing compounds scoring less than −100, leaving 15'681 for further consideration."

image

...."Several thousand reagents predicted to be readily accessible by synthesis were eliminated because they contained chemical functionality not desired in the final LO molecule, although useful for other purposes (e.g., aldehydes); details of this filtering can be found in the Supporting Information.... The remaining 765 alcohols were virtually attached to a key template to allow the calculation of ADME-relevant properties. We excluded any reagent that led to a final compound with a cLogP (32) of less than 2 or greater than 4 or with a topological polar surface area of less than 85 or greater than 125. From the remaining 338 alcohols, we used interactive cheminformatics tools to select 12 for synthesis. We next examined these 12 in the graphical web-based version of the ASKCOS retrosynthesis tool; as shown in Figure 3, the proposed reactions cover a variety of robust chemical transformations."
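The property window quoted above is simple enough to write down directly. The sketch below assumes the cLogP and TPSA values have already been calculated elsewhere for each final compound. The candidate names and numbers are invented for illustration, and treating the bounds as inclusive is my reading of "less than 2 or greater than 4".

```python
# Windows taken from the quoted passage; values on the boundary are kept.
CLOGP_RANGE = (2.0, 4.0)
TPSA_RANGE = (85.0, 125.0)


def within(value: float, bounds: tuple) -> bool:
    """True if value lies inside the inclusive (low, high) window."""
    low, high = bounds
    return low <= value <= high


def passes_property_window(clogp: float, tpsa: float) -> bool:
    """Apply both the cLogP and the TPSA window to one final compound."""
    return within(clogp, CLOGP_RANGE) and within(tpsa, TPSA_RANGE)


# Hypothetical (cLogP, TPSA) pairs for four virtual products.
candidates = {
    "cand-1": (3.1, 98.0),    # inside both windows
    "cand-2": (1.6, 101.0),   # cLogP too low
    "cand-3": (3.8, 131.0),   # TPSA too high
    "cand-4": (2.0, 125.0),   # exactly on both boundaries
}

kept = [
    name
    for name, (clogp, tpsa) in candidates.items()
    if passes_property_window(clogp, tpsa)
]
print(kept)
```

For our own blocks the windows would need to be reset for whatever template and property targets the nsp13 designs end up using.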

image

The ASKCOS software package for computer-aided synthesis planning appears to be a freely available open-source tool. More info is available on the GitHub page (https://github.com/ASKCOS/ASKCOS) and at https://askcos.mit.edu/

Further interpretation to come..

TomkUCL commented 2 years ago

@mattodd @edwintse @jemimahaque Whilst searching for alternative routes that avoid the Grignard/alkyl lithiation needed to reach the current target molecule, I came across a commercially unavailable building block with dual functionality, which could potentially be made in one simple step (Suzuki). The ester and N-protected analogue of this molecule are also unavailable, and this would give us more flexibility with our chemistry based around the pyrrole C2-carbonyl as the core motif, as described in Heba's initial pharmacophore.

The pyrrole bromo acid is commercially available, and we have it in stock since it is what we are currently using.

image

image

mattodd commented 2 years ago

@TomkUCL huh, that's interesting. V simple. What about the alkyne analog of that? Could imagine sequential amide couplings and clicks.

TomkUCL commented 2 years ago


Enamine make the N-CH3 ester analogue, but no results on the free-N acid analogue image

mattodd commented 2 years ago

Oooh. Consider the NH version, and maybe as the ester. Could we do direct amidation in the ball mill then a click? If yes, then that block is looking pretty sweet and we should include it in a few trial structures for @kipUNC @lindapatio @H-agha to play with.

TomkUCL commented 2 years ago


Very nice idea if that could work. Jamie coauthored the ball milling paper, so I'll ask him what he thinks the chances are of that amidation working, since I don't think they tried it with pyrroles...

jemimahaque commented 2 years ago


For the One-Step Retrosynthesis Score, the authors propose a preliminary rule-of-thumb:

  • A One-Step Retrosynthesis Score of −15 or higher indicates the compound can be prepared in a single robust chemical transformation from a readily available reagent.
  • Score of −100 or lower indicates an inaccessible compound.
  • A (relatively rare) Score between −15 and −100 requires further examination (additional discussion/ examples are in the Supporting Information).
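The rule-of-thumb above maps directly onto a three-way triage. A minimal sketch follows. The function name and the labels it returns are made up here, and it assumes the One-Step Retrosynthesis Score has already been obtained for each compound; no ASKCOS call is shown.

```python
# Thresholds taken from the rule-of-thumb bullets above.
ACCESSIBLE_MIN = -15.0      # -15 or higher: one robust step from an available reagent
INACCESSIBLE_MAX = -100.0   # -100 or lower: treated as inaccessible


def classify_one_step_score(score: float) -> str:
    """Triage a One-Step Retrosynthesis Score into three bins."""
    if score >= ACCESSIBLE_MIN:
        return "accessible"
    if score <= INACCESSIBLE_MAX:
        return "inaccessible"
    return "needs review"   # the (relatively rare) in-between scores


# Hypothetical scores, just to show the three outcomes.
for example_score in (-3.2, -15.0, -48.0, -100.0, -250.0):
    print(example_score, classify_one_step_score(example_score))
```

The "needs review" bin is where a chemist's judgement would come in, which fits with trying a few of our unavailable blocks by hand before trusting the cut-offs.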


The nitrile moiety in some of those building blocks may be useful to make various 5-membered heterocycles: image

Also, further functionalization of the nitrile to amine (with protection of the OH) could allow access to urea derivative structures using isocyanates: image

Tom has suggested replacing the CN group in those building blocks with sulfonyl chlorides (which will be useful to make sulfonamides) or isocyanates - the bifunctional building blocks shown below are novel and not available to buy on Enamine:

image

TomkUCL commented 2 years ago

@jemimahaque nice work, thanks Jemima! We can try plugging those structures into the ASKCOS retrosynthesis programme and see what routes it suggests, then discuss from there. The paper discussed making the nitrile analogues from commercially available building blocks in one step, so hopefully we can do something similar for these unavailable compounds.

mattodd commented 2 years ago

Also when considering building blocks that are decorated with things (i.e. pendant things that can be displayed for binding interactions, but which we're not necessarily using to attach anything else) this review of FG's could be looked at.

TomkUCL commented 2 years ago

These papers may be useful for the purposes of identifying under- & non-represented building blocks, including core motifs (e.g. rings) as well as decorative scaffolds - further discussion to follow here.

Here are also links to software from the Molecular AI department at AstraZeneca R&D (https://github.com/MolecularAI), including a tool for retrosynthetic planning (https://github.com/MolecularAI/aizynthfinder), graph neural networks for molecular design (https://github.com/MolecularAI/GraphINVENT) and molecular optimization (https://github.com/MolecularAI/deep-molecular-optimization), which may become useful for our own molecular design on the nsp13 project.

1. Discovery of Novel BRD4 Ligand Scaffolds by Automated Navigation of the Fragment Chemical Space

https://doi.org/10.1021/acs.jmedchem.1c01108

2. Magic Rings: Navigation in the Ring Chemical Space Guided by the Bioactive Rings

https://pubs.acs.org/doi/full/10.1021/acs.jcim.1c00761 Random selection of rings covered by the analysis: https://pubs.acs.org/na101/home/literatum/publisher/achs/journals/content/jcisd8/0/jcisd8.ahead-of-print/acs.jcim.1c00761/20210826/images/large/ci1c00761_0001.jpeg

3. LibINVENT: Reaction-based Generative Scaffold Decoration for in Silico Library Design

https://pubs.acs.org/doi/full/10.1021/acs.jcim.1c00469

Further discussion:

2) Magic Rings: Navigation in the Ring Chemical Space Guided by the Bioactive Rings

GOAL: The goal of the study was to find out how the bioactive rings are distributed in the chemical space and whether it is possible to identify some simple substructure features that are typical for bioactive rings and separate them from the common rings. Results of such analysis would be helpful as a guidance in navigating the ring chemical space and in searching for novel bioactive ring systems.

This graph (https://bit.ly/magicrings) shows that the bioactive rings are distributed throughout the whole chemical space; however, not uniformly; there are regions more densely populated, as well as empty spaces, mostly on the edges of the graph. These border regions are occupied by less explored, more exotic ring systems with not so common structural features. One can see also that many active rings are grouped together into small clusters belonging to the same target class.

https://bit.ly/magicrings shows a principal component analysis plot of 39'361 rings. Points representing the bioactive rings are marked by colour. The x-axis represents approximately the size of the rings (small on the left, large on the right), and the y-axis represents complexity or feature richness. The bottom of the plot contains less complex molecules, and the top, more complex.

Separation of Active and Inactive Rings. ..."The results of this analysis may be seen in Figure 5, where the neural network model was applied to the whole set of nearly 40 000 rings and the results were displayed on the 2D visualization of chemical space generated previously. On this plot one can see several bioactive areas containing high density of active rings... These bioactive areas contain, of course, also many rings without reported biological activity and are the most promising places to look for novel interesting rings. These results are similar to those obtained by our earlier analysis of the chemical space of aromatic rings, (4) where one could also see that the active systems were grouped together into several clusters, surrounded by unexplored areas."

Web Tool to Navigate the Ring Chemical Space. "By using this tool, the users are able to analyse the entire chemical space of 40 000 rings or to focus on some of its more interesting regions. The rings of interest may be selected, their structure depictions displayed, and their SMILES codes downloaded. The screenshots illustrating different ways of using the web interface are shown in Figure 6 (https://pubs.acs.org/doi/10.1021/acs.jcim.1c00761?fig=fig6&ref=pdf). More detailed information about this web tool is available directly online in the tool Help page."

3) LibINVENT: Reaction-based Generative Scaffold Decoration for in Silico Library Design

GOAL: LibINVENT is a novel tool for de novo drug design capable of rapidly proposing chemical libraries of compounds sharing the same core while maximizing a range of desirable properties. The shared core ensures that the compounds in the library are similar, possess desirable properties, and can also be synthesized under the same or similar conditions. The LibINVENT code is freely available in a public repository at https://github.com/MolecularAI/Lib-INVENT. The code necessary for data preprocessing is further available at: https://github.com/MolecularAI/Lib-INVENT-dataset.