MolecularAI / REINVENT4

AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.
Apache License 2.0
359 stars, 89 forks

LinkInvent AssertionError #102

Closed by llvllahsa 4 months ago

llvllahsa commented 4 months ago

Hi,

When running LinkInvent with the attached input SMILES and configuration, I get this error:

Traceback (most recent call last):
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/runmodes/RL/run_staged_learning.py", line 377, in run_staged_learning
    terminate = optimize(package.terminator)
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/runmodes/RL/learning.py", line 125, in optimize
    self.sampled = self.sampling_model.sample(self.seed_smilies)
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/runmodes/samplers/linkinvent.py", line 67, in sample
    sampled = SampleBatch.from_list(sequences)
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/models/model_factory/sample_batch.py", line 123, in from_list
    assert len(transpose) == 5
AssertionError

I have no clue what is wrong here and would appreciate it if you could give me a hand.

Attachment: attachments.zip

halx commented 4 months ago

Hi,

thank you very much for your interest in REINVENT and welcome to the community!

The error message is, unfortunately, rather cryptic; we have improved it since then. The issue is that the fragment separator is the pipe symbol "|" and not "." as one would expect from valid SMILES. This is a historical oversight.
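A quick pre-flight check of the input file can catch the separator problem before a run starts. The following is a minimal sketch, not part of REINVENT: the function name `check_linkinvent_line` is hypothetical, and it assumes each input line holds exactly two fragments separated by "|", each with exactly one "*" attachment point (as in the inputs quoted in this thread). It works on the text only and does not parse the SMILES.

```python
def check_linkinvent_line(line: str) -> list[str]:
    """Return a list of problems found in one input line (empty if none).

    Assumption: a valid line is '<fragment>|<fragment>' and each
    fragment carries exactly one '*' attachment point.
    """
    problems = []
    text = line.strip()
    fragments = text.split("|")

    if len(fragments) != 2:
        problems.append(
            f"expected 2 fragments separated by '|', found {len(fragments)}"
        )
        # Typical mistake: using the SMILES disconnection symbol '.' instead.
        if "." in text and "|" not in text:
            problems.append("'.' found but no '|': use '|' as the separator")
        return problems

    for index, fragment in enumerate(fragments, start=1):
        count = fragment.count("*")
        if count != 1:
            problems.append(
                f"fragment {index} has {count} attachment points, expected 1"
            )
    return problems


if __name__ == "__main__":
    for example in ("Cc1ccc(*)cc1|CCC(C)C(N)*", "Cc1ccc(*)cc1.CCC(C)C(N)*"):
        print(example, "->", check_linkinvent_line(example) or "ok")
```

Running it over every line of the input SMILES file and printing the line number with any reported problem should point to the offending entry directly.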

BTW, the output CSV file is no longer just a scaffold memory, as it was in REINVENT3. There is no parameter "output_file", and we will do strict validation in future versions.

Many thanks, Hannes.

llvllahsa commented 4 months ago

Thanks for elaborating on this. It solved my problem. However, I do not understand what you mean by there being no "output_file". Do you mean I shouldn't use this parameter anymore?

Thanks a lot.

halx commented 4 months ago

Yes, exactly. "output_file" does not exist and will result in an error and termination of REINVENT.

llvllahsa commented 4 months ago

But it works when I use that parameter, so I am confused.

halx commented 4 months ago

At the moment, any extraneous keys in the config file are ignored, so that flag effectively does nothing. In the upcoming release, REINVENT will terminate with an error when it finds that flag.
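Until strict validation arrives, unknown keys can be spotted by comparing a loaded config against a list of keys you know to be valid. This is a generic sketch only: `find_unknown_keys` and `ALLOWED_KEYS` are hypothetical names, and the allowed set below is a placeholder, not REINVENT's actual schema.

```python
def find_unknown_keys(config: dict, allowed: set[str]) -> list[str]:
    """Return the top-level keys of `config` that are not in `allowed`, sorted."""
    return sorted(key for key in config if key not in allowed)


# Placeholder schema and config for illustration; not REINVENT's real keys.
ALLOWED_KEYS = {"known_option"}
config = {"known_option": 1, "output_file": "results.csv"}

unknown = find_unknown_keys(config, ALLOWED_KEYS)
if unknown:
    print(f"Unknown config keys (currently ignored): {unknown}")
```

The check covers only one level of the config; nested sections would need the same comparison applied per section.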

llvllahsa commented 4 months ago

I see, thanks for letting me know.

nbhamilton commented 4 months ago

Hi, I have been following this thread and just wanted to add: in the case of LibInvent, this error pops up if the input SMILES string has stereochemistry, as the brackets interfere with the [*] anchor point.
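One way to find such inputs ahead of time is to scan for the characters SMILES uses for stereochemistry: "@" for tetrahedral centres and "/" or "\" for double-bond geometry. The sketch below is a plain text-level check with hypothetical helper names (`has_stereo_markers`, `flag_stereo_lines`); it does not parse the SMILES and is not part of REINVENT.

```python
# Characters that encode stereochemistry in SMILES:
# '@' (tetrahedral chirality), '/' and '\' (double-bond geometry).
STEREO_MARKERS = ("@", "/", "\\")


def has_stereo_markers(smiles: str) -> bool:
    """Return True if the string contains any SMILES stereo marker."""
    return any(marker in smiles for marker in STEREO_MARKERS)


def flag_stereo_lines(lines):
    """Return (line_number, text) pairs for lines with stereo markers."""
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if has_stereo_markers(line)
    ]


if __name__ == "__main__":
    sample = ["CCO", "C[C@H](N)C(=O)O", "F/C=C/F"]
    for number, text in flag_stereo_lines(sample):
        print(f"line {number}: stereochemistry found in {text}")
```

Removing the stereochemistry from flagged entries is better done with a cheminformatics toolkit than by editing the strings by hand.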

halx commented 4 months ago

Many thanks and welcome to the community.

The underlying reason is that this error shows up when unsupported tokens occur. We plan to improve error reporting in the future.

llvllahsa commented 4 months ago

Hi again,

The input that now gives me an error is this: COC1CC2C(C(=O)N3CC4CC3CN(C)c3c(*)cccc34)C3CCC2(O1)C1CCN(C)C1C3|Cc1ccc(*)cc1

And another time, it was this: CCC(C)C(N)*|Cc1cn(C2CC(N=[N+]=[N-])C(C*)O2)c(=O)[nH]c1=O

Could you please help me figure out what is wrong in these two cases?

halx commented 4 months ago

You would need to provide details as to what the error is. Otherwise I won't be able to help.

llvllahsa commented 4 months ago

Thanks, the error is the same as before:

Traceback (most recent call last):
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/runmodes/RL/run_staged_learning.py", line 377, in run_staged_learning
    terminate = optimize(package.terminator)
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/runmodes/RL/learning.py", line 125, in optimize
    self.sampled = self.sampling_model.sample(self.seed_smilies)
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/runmodes/samplers/linkinvent.py", line 67, in sample
    sampled = SampleBatch.from_list(sequences)
  File ".../REINVENT4/env/lib/python3.10/site-packages/reinvent/models/model_factory/sample_batch.py", line 123, in from_list
    assert len(transpose) == 5
AssertionError

halx commented 4 months ago

Many thanks. This exception usually shows up when the input SMILES are not correct. Have you tried this with the latest release 4.4?

llvllahsa commented 4 months ago

I had not noticed the new update, I'll check it out, thanks.