It seems that verification always requires a complete forward pass over all the tree candidates, which appears to increase the overall computational FLOPs. For example, with "mc_sim_7b_63", each iteration computes 26 candidate tokens, but only two tokens can be accepted.
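To put a rough number on this, here is a minimal sketch of the arithmetic. It assumes the cost of a forward pass scales linearly with the number of tokens processed, which is a simplification (it ignores attention over the cached prefix and the fact that decoding is often memory-bound rather than compute-bound). The function name `verification_overhead` is hypothetical, not part of any library.

```python
def verification_overhead(candidate_tokens: int, accepted_tokens: int) -> dict:
    """Estimate the extra compute spent on tree verification.

    Assumption: forward-pass cost is proportional to the number of
    tokens processed in that pass.
    """
    if candidate_tokens <= 0 or accepted_tokens <= 0:
        raise ValueError("token counts must be positive")
    if accepted_tokens > candidate_tokens:
        raise ValueError("cannot accept more tokens than were computed")
    return {
        # Tokens pushed through the model for each token that is kept.
        "tokens_computed_per_accepted": candidate_tokens / accepted_tokens,
        # Share of the computed tokens that are thrown away.
        "wasted_fraction": (candidate_tokens - accepted_tokens) / candidate_tokens,
    }


# Numbers from the example above: 26 candidates computed, 2 accepted.
stats = verification_overhead(candidate_tokens=26, accepted_tokens=2)
print(stats)
```

Under that assumption, the model processes 13 tokens for every token it keeps, versus 1 per token in plain autoregressive decoding, so the question is whether the reduced number of forward passes makes up for it.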