Open ariel415el opened 1 year ago
I just noticed that hist.max() is the height of the highest bin (and hist.min() the height of the lowest), so making sure both are around n/n_bins ensures the histogram is close to uniform.
Maybe you can explain how you chose the parameters early_end = (200, 320)?
I think it's better to use np.histogram with density=True and make sure the min and max bins are around 1/n_bins.
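To make the suggestion concrete, here is a minimal sketch of a normalized check. The function names, the `tol` parameter, and the assumption that targets are integer indices in `[0, n)` are mine, not taken from the repository. One caveat: with `density=True`, np.histogram returns a probability density (it integrates to 1 over the range), so for equal-width bins over `[0, n)` a uniform histogram has heights around 1/n. Dividing the raw counts by their sum gives the per-bin share, which is the quantity that should be around 1/n_bins.

```python
import numpy as np

def is_close_to_uniform(targets, n, n_bins, tol=0.2):
    """Hypothetical check: every bin holds roughly a 1/n_bins share of the targets.

    targets: array of transport target indices, assumed to lie in [0, n).
    tol:     allowed relative deviation from the uniform share (an assumption).
    """
    counts, _ = np.histogram(targets, bins=n_bins, range=(0, n))
    total = counts.sum()
    if total == 0:
        return False  # nothing recorded yet, so do not stop
    fractions = counts / total          # per-bin share, sums to 1
    expected = 1.0 / n_bins             # share of each bin under uniformity
    return bool(fractions.min() >= (1 - tol) * expected
                and fractions.max() <= (1 + tol) * expected)

def density_heights(targets, n, n_bins):
    """Bin heights as returned by density=True (about 1/n when uniform)."""
    density, _ = np.histogram(targets, bins=n_bins, range=(0, n), density=True)
    return density
```

Unlike fixed count thresholds, this check does not depend on how many targets are kept in memory, only on the tolerance.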
Thanks for your interest in the paper and the code!
The early stopping criterion (200, 320) was chosen mainly for running speed. You can run OTS for more iterations to get closer to a uniform distribution, at the cost of more running time.
And yes, I agree that using density as the stopping criterion sounds more intuitive here.
Hi, thanks for uploading the code for this interesting paper.
At the end of the paper (appendix E.3) you wrote this:
"Our empirical stopping criterion relies upon keeping a histogram of transportation targets in memory: if the histogram of targets is close to a uniform distribution (which is the distribution of training dataset), we stop OTS. This stopping criterion is grounded by our analysis in Section 3.1."
In the code this is implemented in: https://github.com/chen0706/EWM/blob/ee3c358bca08f5f91a5d91ecf826d8de47e2d014/main.py#L108-L112
This practically does the opposite, right? It seems to check whether the range of indices is small enough, i.e. whether the indices are in (n * early_end[0] / n_bins, n * early_end[1] / n_bins).
Is this a bug? Can you explain the actual stopping criterion used here?
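For readers following along, a count-based check of this kind can be sketched as below. This is an illustration under assumptions, not the code at the linked lines: the function name `should_stop`, its arguments, and the histogram range are hypothetical. The point is that the min and max of a histogram are bin heights (how many targets fell in each bin), not index values, so bounding both bounds how far any bin deviates from the uniform count.

```python
import numpy as np

def should_stop(targets, n, n_bins, early_end=(200, 320)):
    """Hypothetical count-based stopping check.

    targets:   transport target indices kept in memory, assumed in [0, n).
    early_end: (lowest, highest) bin count accepted as "close to uniform".
    """
    hist, _ = np.histogram(targets, bins=n_bins, range=(0, n))
    # hist.min() / hist.max() are the heights of the emptiest / fullest bin.
    return bool(early_end[0] <= hist.min() and hist.max() <= early_end[1])

# Each of the 1000 indices is hit 26 times, so each of the 100 bins holds 260.
uniform_targets = np.tile(np.arange(1000), 26)
# Every target collapses onto index 0, so one bin holds everything.
collapsed_targets = np.zeros(26000)

stop_uniform = should_stop(uniform_targets, n=1000, n_bins=100)
stop_collapsed = should_stop(collapsed_targets, n=1000, n_bins=100)
```

With these inputs the uniform case falls inside the (200, 320) window and the collapsed case does not. The thresholds only make sense relative to the number of targets in memory divided by n_bins.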