MolecularAI / REINVENT4

AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.
Apache License 2.0

Addition of Deuterium to model vocab #145

Closed (abazabaaa closed this issue 3 weeks ago)

abazabaaa commented 1 month ago

Hi,

Thanks for putting together the code and documentation. I’ve been enjoying getting into the code and exploring with it a bit.

Our group has some compounds that contain CD3 or CD2 groups, and some of our models rely on those features. As far as I am aware, REINVENT doesn’t tolerate isotopes. I wanted to see if you had gone down this road at all and maybe had a few thoughts. We would be happy to give it a shot and contribute to the project, but didn’t want to rush into something you had already abandoned.

Thanks for your time, Tom

halx commented 1 month ago

Hi,

many thanks for your interest in REINVENT and welcome to the community!

I was wondering why your models depend on deuterated compounds.

When preparing all prior models, we strip out all isotopes. The vocabulary is fixed after training and cannot be extended. This means that you would have to train a new model that keeps [2H].

Cheers, Hannes.
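
To illustrate the point about a fixed vocabulary, here is a minimal sketch in plain Python. It is not REINVENT's actual tokenizer or preprocessing code: the regex tokenizer, the `VOCAB` set, and the helper names `tokenize`, `unknown_tokens` and `strip_isotopes` are all assumptions made up for this example. It shows that `[2H]` is a distinct bracket-atom token that a vocabulary built from isotope-stripped data would not contain, and what a simple isotope-stripping step could look like.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration only):
# bracket atoms, two-letter halogens, %nn ring closures, then any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens; a bracket atom such as [2H] is one token."""
    return TOKEN_RE.findall(smiles)


def unknown_tokens(smiles, vocabulary):
    """Return the tokens of `smiles` that are missing from a fixed vocabulary."""
    return [tok for tok in tokenize(smiles) if tok not in vocabulary]


def strip_isotopes(smiles):
    """Drop the isotope number that opens a bracket atom, e.g. [2H] -> [H]."""
    return re.sub(r"\[\d+", "[", smiles)


# Hypothetical vocabulary, as if built from training data without isotopes.
VOCAB = {"C", "O", "N", "c", "(", ")", "=", "1", "[H]"}

cd3_methanol = "[2H]C([2H])([2H])O"  # CD3-OH written with explicit deuterium

print(tokenize(cd3_methanol))
print(unknown_tokens(cd3_methanol, VOCAB))   # the [2H] tokens are not in VOCAB
print(strip_isotopes(cd3_methanol))          # isotope labels removed
print(unknown_tokens(strip_isotopes(cd3_methanol), VOCAB))
```

In this sketch the deuterated SMILES only becomes representable after the isotope labels are stripped, which discards exactly the information the downstream models need. Keeping deuterium would therefore mean including `[2H]` in the vocabulary when the model is trained. A real pipeline would more likely use a cheminformatics toolkit than a regex for the stripping step.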

abazabaaa commented 1 month ago

Some of our models are trained on compounds containing deuterium. Metabolism can be influenced by substituting C-D for C-H. Without accounting for D, it would be hard to use those models with REINVENT. My colleague will follow up with a question about how best to modify the code. I think she has some ideas, but we want to make sure we aren’t breaking anything.