Closed StefanGliga closed 11 months ago
@henk717 this looks good to me, have you tested yet for a merge?
Have not been able to review it yet (see Discord for details); it's the first one on my list once I am able to review.
Can't reproduce the issue I had with it anymore, so I'll merge it and hopefully we don't break userspace.
I have implemented epsilon and eta sampling from https://arxiv.org/abs/2210.15191. In the UI I opted to expose the cutoff in units of 1e-4, to be similar to ooba. Tested on both PyTorch/GPU and TPU (jax_static specifically), and everything seems to work fine.
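For anyone reading along, here is a rough sketch of what the two samplers do, as I understand them from the paper. This is not the code from this PR: the function names (`epsilon_filter`, `eta_filter`, `from_ui_units`) are made up for illustration, it works on plain Python lists rather than PyTorch or JAX tensors, and the tie-breaking rule of always keeping the most likely token is an assumption of the sketch.

```python
import math

def from_ui_units(value):
    # Hypothetical helper: the UI exposes the cutoff in units of 1e-4,
    # so a UI value of 3 corresponds to a probability cutoff of 3e-4.
    return value * 1e-4

def epsilon_filter(probs, epsilon):
    # Epsilon sampling: drop every token whose probability is below epsilon.
    # The most likely token is always kept so the result is never empty.
    top = max(probs)
    kept = [p if (p >= epsilon or p == top) else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]  # renormalize the survivors

def eta_filter(probs, epsilon):
    # Eta sampling: the cutoff shrinks when the distribution is high-entropy.
    # eta = min(epsilon, sqrt(epsilon) * exp(-entropy))
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    eta = min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))
    return epsilon_filter(probs, eta)

# One confident token plus a long flat tail of 100 low-probability tokens.
probs = [0.4] + [0.006] * 100
eps_out = epsilon_filter(probs, 0.01)  # the tail falls below 0.01 and is cut
eta_out = eta_filter(probs, 0.01)      # entropy lowers the cutoff, tail survives
```

The difference shows up on flat distributions: with the same setting, epsilon sampling cuts the whole tail, while eta sampling lowers its threshold because the entropy is high and lets the tail through.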