Open Arrabonae opened 1 day ago
@xjdr-alt so right now, being gpu poor, mcts runs very slowly - as expected. instead i've implemented sparkling beam search. here is a comparison of beam search vs the current implementation:
colour code:
white: 'adaptive sampling' (middle point)
blue: 'branching' (low entropy, high varentropy)
red: 'resampling' (high entropy, high varentropy)
yellow (none present): this would be high entropy, low varentropy, i.e. ask clarifying questions
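for anyone reading along, here is a minimal sketch of how a colour could be derived from the logits at one step. it is not the code from the repo: the threshold values and the names `entropy_varentropy` and `classify` are made up for illustration, and mapping everything outside the three named quadrants to white is an assumption.

```python
import math

def entropy_varentropy(logits):
    """Return (entropy, varentropy) in nats for one step's logits."""
    m = max(logits)
    # log-softmax computed stably by subtracting the max logit first
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    probs = [math.exp(lp) for lp in log_probs]
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs))
    # varentropy: variance of the surprisal (-log p) under the distribution
    varentropy = sum(p * (-lp - entropy) ** 2 for p, lp in zip(probs, log_probs))
    return entropy, varentropy

def classify(entropy, varentropy,
             ent_low=0.5, ent_high=2.0, var_low=0.5, var_high=2.0):
    """Map (entropy, varentropy) to a colour and action.

    All four thresholds are hypothetical placeholders, not tuned values.
    """
    if entropy < ent_low and varentropy > var_high:
        return "blue", "branching"
    if entropy > ent_high and varentropy > var_high:
        return "red", "resampling"
    if entropy > ent_high and varentropy < var_low:
        return "yellow", "ask clarifying questions"
    # assumption: everything else is treated as the 'middle point'
    return "white", "adaptive sampling"
```

a uniform distribution is a handy sanity check: its entropy is the log of the vocabulary size and its varentropy is zero, so with a large enough vocabulary it lands in the yellow quadrant.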
current sampling (original): note that the name is wrong
scenario 1: static beam search on blue only: note that the name is wrong
i think the difference between the two is marginal, although i think beam is a little bit better.
Scenario 2: apply adaptive beam search to red (high / high): note that the name is right
Scenario 3: apply blue, static beam search and adaptive red: note that the name is right
i like scenario 3 the most. i'll keep experimenting later.
next: mcts for the low entropy / high varentropy (branching) case.