MolecularAI / Chemformer

Apache License 2.0

Question on sampling desirable molecules #18

Closed Doha7430 closed 1 year ago

Doha7430 commented 1 year ago

Hi! In the experiment on generating desirable molecules, Chemformer used beam search to generate output molecules, while the Transformer and Transformer-R used greedy search. Why were different sampling methods used? Or would each sampling method produce the same result?

EBjerrum commented 1 year ago

Greedy, multinomial, and beam search can all produce different results. The most likely overlap is that the top beam-search result is the same as the one found by greedy search. Multinomial sampling usually finds solutions with a higher NLL (that is, a lower probability) and will produce new solutions each time, whereas both greedy and beam search are more or less deterministic. Beam search produces a range of high-probability solutions, whereas greedy search produces only one. Sometimes beam search finds a solution with a lower NLL than greedy search, even though some of its intermediate tokens were not the most probable choice at their step. Which Transformer and Transformer-R models exactly are you referring to?
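
To make the difference between greedy and multinomial decoding concrete, here is a minimal sketch. It assumes a hypothetical toy "model" (the `TOY_MODEL` table below, with made-up tokens and probabilities) and is not Chemformer's actual decoding code. It shows that greedy decoding returns the same sequence on every call, while multinomial sampling returns several different sequences whose NLL is, on average, higher than the greedy one in this toy setting.

```python
import math
import random

# Hypothetical toy "model": next-token probabilities that depend only on the
# previous token. "^" starts a sequence and "$" ends it. All values are made up.
TOY_MODEL = {
    "^": {"C": 0.6, "N": 0.4},
    "C": {"O": 0.55, "N": 0.45},
    "N": {"$": 0.9, "O": 0.1},
    "O": {"$": 1.0},
}


def decode(pick):
    """Generate one sequence, choosing each token with `pick`; return (text, NLL)."""
    prev, out, nll = "^", [], 0.0
    while prev != "$":
        dist = TOY_MODEL[prev]
        prev = pick(dist)
        nll -= math.log(dist[prev])  # accumulate negative log-likelihood
        out.append(prev)
    return "".join(out[:-1]), nll  # drop the end token from the text


def greedy_pick(dist):
    """Always take the single most probable next token."""
    return max(dist, key=dist.get)


def make_multinomial_pick(rng):
    """Build a picker that samples the next token in proportion to its probability."""
    def pick(dist):
        tokens = list(dist)
        return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
    return pick


greedy_seq, greedy_nll = decode(greedy_pick)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [decode(make_multinomial_pick(rng)) for _ in range(2000)]
mean_sample_nll = sum(nll for _, nll in samples) / len(samples)
distinct = {seq for seq, _ in samples}

print(f"greedy: {greedy_seq!r}, NLL={greedy_nll:.3f}")
print(f"multinomial: {len(distinct)} distinct sequences, mean NLL={mean_sample_nll:.3f}")
```

Individual samples can still land on a sequence with a lower NLL than the greedy one; the comparison here is between the greedy NLL and the mean over many samples.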

Doha7430 commented 1 year ago

Thanks. Transformer and Transformer-R are the models that are compared with the fine-tuned Chemformer. Your article also shows the results of Transformer and Transformer-R in table 6. There is also this sentence in section 4.1:

However, for molecular optimisation, our Chemformer models used beam search (with a beam width of 10) to generate output molecules, while the Transformer and Transformer-R benchmarks used greedy search.

Can I take it that the top-1 beam-search result with a beam width of 10 is the same as greedy search?

EBjerrum commented 1 year ago

No, greedy search and beam search are not equivalent. Beam search usually gives better decoding and accuracy, although for "easy" predictions greedy search and the beam-search top-1 often give the same sequence during sampling. We discuss why we can't compare the models directly.
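
As an illustration of why the two are not equivalent, the sketch below runs a simple beam search over a hypothetical toy "model" (the `TOY_MODEL` table, with made-up tokens and probabilities). It is not Chemformer's implementation, and it omits refinements such as length normalisation. With a beam width of 1 the search keeps only the single best continuation at each step, which is greedy decoding. With a beam width of 10 it also keeps the less probable first token, and in this toy case that path ends up as the top-1 result with a lower NLL.

```python
import math

# Hypothetical toy "model": next-token probabilities that depend only on the
# previous token. "^" starts a sequence and "$" ends it. All values are made up.
TOY_MODEL = {
    "^": {"C": 0.6, "N": 0.4},
    "C": {"O": 0.55, "N": 0.45},
    "N": {"$": 0.9, "O": 0.1},
    "O": {"$": 1.0},
}


def beam_search(width):
    """Return finished (sequence, NLL) pairs, lowest NLL first."""
    beams = [("^", 0.0)]  # each beam is (tokens so far, accumulated NLL)
    finished = []
    while beams:
        # Extend every open beam by every possible next token.
        candidates = []
        for seq, nll in beams:
            for token, prob in TOY_MODEL[seq[-1]].items():
                candidates.append((seq + token, nll - math.log(prob)))
        # Keep only the `width` best candidates at this step.
        candidates.sort(key=lambda c: c[1])
        beams = []
        for seq, nll in candidates[:width]:
            if seq.endswith("$"):
                finished.append((seq, nll))
            else:
                beams.append((seq, nll))
    finished.sort(key=lambda c: c[1])
    # Strip the start and end markers from the returned text.
    return [(seq[1:-1], nll) for seq, nll in finished]


greedy_like = beam_search(width=1)  # width 1 behaves like greedy search
wide = beam_search(width=10)

print("width 1 :", greedy_like)
print("width 10:", wide)
```

Here the greedy path commits to the most probable first token and cannot recover, while the wider beam keeps the alternative alive long enough to see that it leads to a more probable complete sequence.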
