hsidky / dmaps

C++ Accelerated Python Diffusion Maps Library
MIT License

Question on re-weighting biased data #3

Closed yihengwuKP closed 2 years ago

yihengwuKP commented 2 years ago

Hi, the README.md mentions that the package can re-weight the kernel for biased input, but it doesn't explain how to use that feature. How do I use it?

I also can't find the "example folder" that's mentioned in the README. The link to the blog that's mentioned in other issues is also down. Also, is the reweighting scheme the same as the umbrella integrated dmap paper?

hsidky commented 2 years ago

Hi. Yes, sorry, I have not done a good job of maintaining this repo. You can provide a vector of length N, where N is the number of observations in the dataset, as the second argument to the DiffusionMap constructor. This library does not compute the weights as discussed in the paper you cited. However, if you calculate the weights using umbrella sampling, you can provide them to the library and you should be good to go.
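To make the "vector of length N" concrete, here is a minimal sketch of how such weights might be computed before handing them to the library. It assumes the simplest case, a single static bias potential, where each frame gets a weight proportional to exp(V_bias / kT). The function name `unbiasing_weights` and the example energies are hypothetical and not part of dmaps; data from several umbrella windows would need a proper estimator such as WHAM or MBAR instead.

```python
import math

def unbiasing_weights(bias_energies, kT):
    """Return normalized weights w_i proportional to exp(V_bias(x_i) / kT).

    Assumes one static bias potential (hypothetical helper, not dmaps API).
    """
    if kT <= 0:
        raise ValueError("kT must be positive")
    exponents = [v / kT for v in bias_energies]
    # Subtract the largest exponent first so exp() cannot overflow.
    shift = max(exponents)
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical bias energies (kJ/mol) for N = 4 frames, with kT = 2.5 kJ/mol.
weights = unbiasing_weights([0.0, 2.5, 5.0, 0.0], kT=2.5)

# One weight per observation; this list is what would be passed as the
# second argument to the DiffusionMap constructor described above.
assert len(weights) == 4
```

Frames that sat under a larger bias energy get a larger weight, which undoes the oversampling the bias introduced.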

Hope this helps!

yihengwuKP commented 2 years ago

Hi Hythem, many thanks for your quick reply, this helps a lot! I'll try that! Is your reweighting implementation described in any of the three citations you provided? Thanks!

hsidky commented 2 years ago

It's actually a generic implementation that lets you provide an arbitrary weight for each observation. If you perform umbrella sampling as described in the paper you cited, you can compute a weight for each configuration and use those weights in this implementation. Alternatively, you can do things as outlined in the paper, and the result can still be used with this library.
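As a rough illustration of what "an arbitrary weight for an observation" can mean in a diffusion map, the sketch below builds a row-normalized Gaussian kernel in which each column j is scaled by the weight of observation j. This is one simple scheme chosen for illustration; it is an assumption, not necessarily how dmaps applies the weights internally, and `weighted_transition_matrix` is a hypothetical name. It assumes all weights are positive.

```python
import math

def weighted_transition_matrix(points, weights, epsilon):
    """Row-stochastic matrix P_ij proportional to w_j * exp(-|x_i - x_j|^2 / (2 * epsilon)).

    Illustrative only (hypothetical helper, not the dmaps implementation).
    """
    n = len(points)
    if len(weights) != n:
        raise ValueError("need exactly one weight per observation")
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            # Squared Euclidean distance between observations i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            row.append(weights[j] * math.exp(-d2 / (2.0 * epsilon)))
        row_sum = sum(row)
        # Normalize so each row sums to one (a Markov transition matrix).
        matrix.append([v / row_sum for v in row])
    return matrix

# Three 1-D observations; the last one carries twice the weight of the others.
P = weighted_transition_matrix([(0.0,), (1.0,), (2.0,)], [1.0, 1.0, 2.0], epsilon=1.0)
```

From the middle point, the two neighbours are equally far away, so the transition probability to the doubly weighted neighbour is exactly twice that to the other one.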

yihengwuKP commented 2 years ago

Thanks for your detailed explanation! Btw thanks for sharing your implementation of diffusion map, it's really helpful and saved me a lot of time!