josipd / torch-two-sample

A PyTorch library for two-sample tests

Added support for median heuristic for kernel bandwidth of MMD Test #10

Open lisun-ai opened 2 years ago

lisun-ai commented 2 years ago

Hi,

Thanks for maintaining this helpful repo. I would like to add support for the median heuristic for choosing the kernel bandwidth of the MMD test. Choosing a good kernel bandwidth (sqrt(1/alpha) in the code) for the MMD test can be difficult. One of the most common approaches in the literature is the median heuristic (see Refs. [1] and [2]), which sets the bandwidth to the median of all pairwise distances between sample points. Empirical results show that this heuristic is effective and stable. Would you be willing to review the code and merge it? Thanks!
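For concreteness, here is a minimal sketch of the heuristic in plain Python. It is not the code from the proposed change. The function names `median_bandwidth` and `alpha_from_bandwidth` are hypothetical, pooling both samples before taking the median is an assumption, and the conversion to `alpha` assumes the relation stated above (bandwidth = sqrt(1/alpha)).

```python
import math
from itertools import combinations
from statistics import median


def median_bandwidth(sample_1, sample_2):
    """Median of all pairwise Euclidean distances in the pooled sample.

    Hypothetical helper: each sample is a sequence of equal-length
    coordinate tuples. Pooling the two samples is an assumption.
    """
    pooled = list(sample_1) + list(sample_2)
    if len(pooled) < 2:
        raise ValueError("need at least two points to form a pairwise distance")
    # One distance per unordered pair (i < j); self-distances are excluded.
    distances = [math.dist(a, b) for a, b in combinations(pooled, 2)]
    return median(distances)


def alpha_from_bandwidth(bandwidth):
    """Invert bandwidth = sqrt(1 / alpha), i.e. alpha = 1 / bandwidth**2."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive (are all points identical?)")
    return 1.0 / bandwidth ** 2


# Example: the pairwise distances are 5, 10 and 5, so the median is 5.
sample_1 = [(0.0, 0.0), (3.0, 4.0)]
sample_2 = [(6.0, 8.0)]
bandwidth = median_bandwidth(sample_1, sample_2)
alpha = alpha_from_bandwidth(bandwidth)
```

The sketch enumerates all pairs explicitly, which costs O(n^2) time and memory. For large samples a vectorized distance computation, or a median taken over a random subset of pairs, would likely be preferable.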

Ref [1] Schölkopf, Bernhard, and Smola, A. J. Learning with Kernels. MIT Press, Cambridge, MA, 2002.

Ref [2] Ramdas, Aaditya, et al. "On the decreasing power of kernel and distance based nonparametric hypothesis tests in high dimensions." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 29. No. 1. 2015.

Regards, Li