[Feature Request] Add Beam Search as decoding strategy

Hey there,

first, big thanks for making this awesome library open source. I wish I had discovered it earlier; it would probably have saved me several hours of work ;)

One thing I am missing, though, is beam search as a decoding strategy. Do you already plan to add it? If not, I would be happy to help with that.

Motivation

Beam search can significantly improve the quality of the solutions obtained from a trained model (e.g. Kool et al., 2018).

Solution

This would probably require a base decoding strategy class in which each strategy (greedy, sampling, beam search) defines a step function that is called in every iteration of the autoregressive decoder. It would also need something like a post_decoder_hook, where the backtracking of the beam search (and, for the other strategies, the concatenation of the actions) takes place.
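To make the idea more concrete, here is a minimal pure-Python sketch of the interface described above. It is not the library's API: `DecodingStrategy`, `Greedy`, `BeamSearch`, `decode` and `toy_policy` are hypothetical names, it handles a single instance with plain lists instead of batched tensors, and the toy policy only depends on the previous action. Sampling is left out; it would follow the same pattern as `Greedy`.

```python
import math
from abc import ABC, abstractmethod


class DecodingStrategy(ABC):
    """Hypothetical base class: `step` runs once per decoder iteration."""

    @abstractmethod
    def step(self, log_probs):
        """log_probs[b][a] is the log-probability of action a on live beam b.
        Returns (parents, actions) for the beams that survive this step."""

    @abstractmethod
    def post_decoder_hook(self):
        """Assemble the final action sequence(s) once decoding has finished."""


class Greedy(DecodingStrategy):
    def __init__(self):
        self.actions = []

    def step(self, log_probs):
        row = log_probs[0]  # greedy keeps a single beam
        action = max(range(len(row)), key=row.__getitem__)
        self.actions.append(action)
        return [0], [action]

    def post_decoder_hook(self):
        # Nothing to backtrack: just concatenate the chosen actions.
        return [list(self.actions)]


class BeamSearch(DecodingStrategy):
    def __init__(self, beam_width):
        self.beam_width = beam_width
        self.scores = [0.0]  # cumulative log-probability per live beam
        self.parents = []    # per step: parent beam index of each kept beam
        self.actions = []    # per step: action taken by each kept beam

    def step(self, log_probs):
        # Score every feasible (beam, action) extension, keep the best ones.
        candidates = [
            (self.scores[b] + lp, b, a)
            for b, row in enumerate(log_probs)
            for a, lp in enumerate(row)
            if lp > -math.inf
        ]
        candidates.sort(key=lambda c: c[0], reverse=True)
        top = candidates[: self.beam_width]
        self.scores = [score for score, _, _ in top]
        parents = [b for _, b, _ in top]
        actions = [a for _, _, a in top]
        self.parents.append(parents)
        self.actions.append(actions)
        return parents, actions

    def post_decoder_hook(self):
        # Backtrack from each final beam to the first step via parent pointers.
        sequences = []
        for beam in range(len(self.scores)):
            seq, idx = [], beam
            for parents, actions in zip(reversed(self.parents), reversed(self.actions)):
                seq.append(actions[idx])
                idx = parents[idx]
            sequences.append(seq[::-1])
        return sequences  # sorted by score, best first


def toy_policy(prev_action):
    """Made-up two-action policy on which greedy decoding is suboptimal."""
    probs = {None: [0.6, 0.4], 0: [0.55, 0.45], 1: [0.9, 0.1]}[prev_action]
    return [math.log(p) for p in probs]


def decode(strategy, num_steps, policy):
    prev = [None]  # toy decoder state: last action of each live beam
    for _ in range(num_steps):
        log_probs = [policy(p) for p in prev]
        _parents, actions = strategy.step(log_probs)
        prev = actions  # a real decoder would also reindex its state by _parents
    return strategy.post_decoder_hook()


print(decode(Greedy(), 2, toy_policy))                 # [[0, 0]], probability 0.33
print(decode(BeamSearch(beam_width=2), 2, toy_policy)) # [[1, 0], [0, 0]], best has probability 0.36
```

In an actual implementation, `step` would presumably operate on batched tensors, and the decoder would have to reorder its cached state with the returned parent indices after every step; the sketch skips both because the toy state is just the last action.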

Checklist

[x] I have checked that there is no similar issue in the repo (required)
