Closed Doha7430 closed 1 year ago
Greedy, multinomial and beam search can all produce different results, although the most likely outcome is that the top beam-search result is the same as the sequence produced by greedy search. Multinomial sampling usually finds solutions with a higher NLL and produces new solutions each time, whereas greedy and beam search are more or less deterministic. Beam search produces a range of high-probability solutions, whereas greedy search produces only one. Sometimes beam search finds a solution with a lower NLL even though some intermediate tokens were not the most probable choice at their step. Which Transformer and Transformer-R models exactly are you referring to?
Thanks. Transformer and Transformer-R are the models compared with the fine-tuned Chemformer. Your article also shows the results of Transformer and Transformer-R in table 6, and there is this sentence in section 4.1:
However, for molecular optimisation, our Chemformer models used beam search (with a beam width of 10) to generate output molecules, while the Transformer and Transformer-R benchmarks used greedy search.
Can I take it that the top-1 beam-search result with a beam width of 10 is the same as the greedy search result?
No, greedy search and beam search are not equivalent. Beam search usually gives better decoding and accuracy, although for "easy" predictions greedy search and the beam-search top-1 often give the same sequence during sampling. We discuss why we can't compare the models directly.
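To make the difference concrete, here is a minimal sketch on a made-up two-step "model". The probability table `TOY_MODEL` and the function names are hypothetical and chosen only for illustration; this is not the Chemformer sampling code. In this toy case greedy search commits to the most probable first token and ends with total probability 0.6 × 0.55 = 0.33, while a wider beam keeps the second-best first token alive and finds a sequence with probability 0.4 × 0.9 = 0.36, i.e. a lower NLL.

```python
import math
import random

# Hypothetical toy "model": next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"z": 0.9, "w": 0.1},
}
SEQ_LEN = 2  # every sequence in this toy example has exactly two tokens


def greedy_search():
    """Pick the single most probable token at every step."""
    seq, nll = (), 0.0
    for _ in range(SEQ_LEN):
        probs = TOY_MODEL[seq]
        token = max(probs, key=probs.get)
        nll -= math.log(probs[token])
        seq += (token,)
    return seq, nll


def beam_search(beam_width):
    """Keep the beam_width lowest-NLL prefixes at every step."""
    beams = [((), 0.0)]
    for _ in range(SEQ_LEN):
        candidates = [
            (seq + (token,), nll - math.log(p))
            for seq, nll in beams
            for token, p in TOY_MODEL[seq].items()
        ]
        candidates.sort(key=lambda item: item[1])  # lowest NLL first
        beams = candidates[:beam_width]
    return beams


def multinomial_sample(rng):
    """Draw each token at random according to the model probabilities."""
    seq, nll = (), 0.0
    for _ in range(SEQ_LEN):
        probs = TOY_MODEL[seq]
        token = rng.choices(list(probs), weights=list(probs.values()))[0]
        nll -= math.log(probs[token])
        seq += (token,)
    return seq, nll


greedy_seq, greedy_nll = greedy_search()
beam1 = beam_search(1)    # a beam width of 1 reduces to greedy search
beam10 = beam_search(10)  # wider than the 4 possible sequences, so all are kept

rng = random.Random(0)
samples = [multinomial_sample(rng) for _ in range(5)]

print("greedy:       ", greedy_seq, round(greedy_nll, 3))
print("beam (top-1): ", beam10[0][0], round(beam10[0][1], 3))
print("beam (all):   ", [seq for seq, _ in beam10])
print("multinomial:  ", [seq for seq, _ in samples])
```

In this sketch the top-1 result of the wide beam differs from the greedy result, and only a beam width of 1 is guaranteed to reproduce greedy search. With a real model the two often coincide, but the comparison is still not like-for-like.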
Hi! In the experiment on generating desirable molecules, Chemformer used beam search to generate output molecules, while the Transformer and Transformer-R used greedy search. Why are the sampling methods different? Or does each sampling method produce the same result?