Open Lucas-TY opened 5 months ago
Hello,
Thanks for your interest in our work! In our provided implementation, we set $\gamma_1 = 1$ because we observed that performance is nearly the same for $\gamma_1 = 2$ and decreases for larger values of $\gamma_1$. The cause is the low acceptance rate of Llama-68M: most of the extra tokens it drafts are rejected, so the additional draft steps are wasted. To keep things simple, our open-source code uses $\gamma_1 = 1$.
If you’d like to try using better draft models with higher acceptance rates, you can directly modify the function linked below. You only need to add an extra inner loop for $\gamma_1$:
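For illustration only, here is a minimal, self-contained sketch of where such an inner loop could sit in a two-level speculative decoding loop. It is not the repository's code: the names `verify`, `hierarchical_decode`, `tiny`, `middle`, and `target` are hypothetical, the models are plain Python callables that return a greedy next token, and verification is done token by token rather than in one batched forward pass. It assumes `gamma1` counts the tokens the smallest draft model proposes per step and `gamma2` is the block length handed to the target model.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def verify(prefix: List[int], proposed: List[int], verifier: Model) -> List[int]:
    """Greedy verification: keep proposed tokens while they match the verifier.

    On the first mismatch, emit the verifier's token instead and stop.
    If every proposed token is accepted, append one extra verifier token.
    """
    accepted: List[int] = []
    for tok in proposed:
        expected = verifier(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # correction token replaces the rejected one
            return accepted
        accepted.append(tok)
    accepted.append(verifier(prefix + accepted))  # bonus token after full acceptance
    return accepted


def hierarchical_decode(
    tiny: Model,
    middle: Model,
    target: Model,
    prompt: List[int],
    max_new: int,
    gamma1: int = 1,
    gamma2: int = 4,
) -> List[int]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # Outer draft level: collect at least gamma2 tokens checked by `middle`.
        block: List[int] = []
        while len(block) < gamma2:
            # Extra inner loop: `tiny` drafts gamma1 tokens before `middle` verifies.
            draft: List[int] = []
            for _ in range(gamma1):
                draft.append(tiny(out + block + draft))
            block += verify(out + block, draft, middle)
        # The target model verifies the whole block.
        out += verify(out, block, target)
    return out[: len(prompt) + max_new]
```

With `gamma1 = 1` the inner `for` loop runs once per step, which corresponds to the simplified setting described above. Because every emitted token is checked against the target, the output in this greedy toy setting matches plain greedy decoding with the target regardless of `gamma1`; only the number of draft and verification calls changes.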
If you have any further questions, feel free to ask.
It seems like gamma is $\gamma_2$, but how do you change $\gamma_1$?