Hi, thanks for pointing this out and sorry for the slow response.
I have double-checked, and the code used in our experiments is the same as the implementation we link to. It seems this is a mistake in the algorithm as written in the paper, not something intentional.
It looks like the difference is that the upper bound on the log-uniform distribution is $\frac{N}{n_\text{group} - 0.999}$ in the code and $\frac{N-1}{n_\text{group}}$ in the algorithm. These two bounds diverge most when $n_\text{group}$ is small: at $n_\text{group} = 1$ the code's bound is $\frac{N}{0.001} = 1000N$, while the paper's is $N - 1$.
I would say that the version in the code makes more sense, as it allows for a broader distribution - e.g. when we want to space three points across a distance $N$, it is possible to space them up to $\frac{N}{2}$ apart, not just $\frac{N}{3}$. Arguably the ideal expression for the upper bound would be $\frac{N}{n_\text{group} - 1}$, but I think we changed this to $\frac{N}{n_\text{group} - 0.999}$ as a simple way to avoid NaNs when $n_\text{group} = 1$.
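For concreteness, here is a minimal sketch of the sampling with both bounds; `sample_scale`, `lo`, and the function signature are illustrative assumptions, not the repo's actual API:

```python
import numpy as np

def sample_scale(N, n_group, lo=1.0, rng=None):
    """Draw a group scale log-uniformly between lo and an upper bound."""
    rng = rng or np.random.default_rng()
    # Upper bound as implemented in the code; the -0.999 offset keeps the
    # denominator positive at n_group = 1, avoiding division by zero / NaNs.
    hi = N / (n_group - 0.999)
    # hi = (N - 1) / n_group  # upper bound as written in the paper's algorithm
    # Sample uniformly in log space, then exponentiate (log-uniform).
    return np.exp(rng.uniform(np.log(lo), np.log(hi)))
```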
We will update the paper on arXiv in the next few weeks and fix the algorithm there.
Thanks again for pointing this out!
The paper states that group scales are sampled uniformly when creating training samples. However, the implementation subtracts 0.999 from `n_group` before exponentiating. This means that the scale can explode 1000x. Is this intentional?