DiffusionMapsAcademics / pyDiffMap

Library for diffusion maps
MIT License

precompute option for kernel? #11

Open ralfbanisch opened 6 years ago

ralfbanisch commented 6 years ago

Hey guys,

so scikit-learn has a 'precomputed' option for the metric (metric='precomputed') that lets users pass their own precomputed distance matrix to the neighbour search object. This already works with pydiffmap thanks to the **kwargs feature. This can be very useful: if you use, for example, an RMSD metric, passing a custom RMSD metric to scikit-learn turns out to be a lot slower than just precomputing all the RMSDs with e.g. mdtraj.
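For reference, here is a minimal sketch of the scikit-learn side of this. It uses plain Euclidean distances as a stand-in for RMSD, and the data and variable names are made up for illustration. How the options get forwarded through pydiffmap is not shown, since the exact call signature is not spelled out in this thread.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy data (assumption): 50 random points in 3D.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

# Stand-in for an expensive metric such as RMSD: compute the full dense
# (n_samples, n_samples) distance matrix once, up front.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# metric="precomputed" tells scikit-learn that the input to fit() is a
# distance matrix rather than a feature matrix.
nn = NearestNeighbors(n_neighbors=5, metric="precomputed")
nn.fit(D)

# Queries are rows of distances to the fitted points, so passing D again
# queries every training point (each point's nearest neighbour is itself).
dist, idx = nn.kneighbors(D)
```

Note that D here is a dense n-by-n array, which is where the memory concern comes from.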

However, the precomputed distance matrix has to be dense, so there are memory issues for large datasets.

Should we add a 'precompute kernel' feature that allows users to precompute their own sparse kernel matrix and pass it to pydiffmap? Presumably that would mean that our kernel class simply gets bypassed and the sparse kernel matrix precomputed by the user would be used instead. This would add a lot of flexibility (people could try out their own kernels if they wanted to, and so on).
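To make the idea concrete, here is a rough sketch of what "bypassing the kernel class" could look like, written with scipy only. The helper name diffusion_coords_from_kernel is hypothetical and not part of pydiffmap, and the sketch assumes a symmetric kernel with no density normalization (alpha = 0) and a connected graph.

```python
import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import eigsh


def diffusion_coords_from_kernel(K, n_evecs=2):
    """Hypothetical helper: eigenpairs of P = D^{-1} K for a user-supplied
    symmetric sparse kernel matrix K, skipping the trivial constant mode."""
    K = sps.csr_matrix(K)
    d = np.asarray(K.sum(axis=1)).ravel()  # row sums
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # S = D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues as P.
    S = sps.diags(d_inv_sqrt) @ K @ sps.diags(d_inv_sqrt)
    evals, evecs = eigsh(S, k=n_evecs + 1, which="LA")
    order = np.argsort(evals)[::-1]  # sort descending
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs * d_inv_sqrt[:, None]  # right eigenvectors of P
    return evals[1:], psi[:, 1:]  # drop eigenvalue 1


# Toy "user-precomputed" kernel (assumption): thresholded Gaussian kernel
# on 60 points on a circle, stored as a sparse matrix.
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_dense = np.exp(-sq_dists / 0.05)
K_dense[K_dense < 1e-3] = 0.0  # symmetric thresholding keeps K symmetric
K = sps.csr_matrix(K_dense)

evals, coords = diffusion_coords_from_kernel(K, n_evecs=2)
```

Nothing in this sketch needs to know how K was built, which is the flexibility argument: any symmetric sparse kernel the user can compute would work.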

The difficulty would be that out-of-sample extensions wouldn't work, since they require the kernel to be computed at additional evaluation locations. So we'd have to raise a flag that this is not supported for precomputed kernels.
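To illustrate why, here is a sketch of a Nyström-style out-of-sample extension in plain numpy. The Gaussian kernel, its epsilon convention, and all names are assumptions for illustration and may differ from what pydiffmap does internally. The point is that nystroem_extend has to evaluate the kernel function between the new points and the training points, which is impossible if only a precomputed matrix is available.

```python
import numpy as np


def gaussian_kernel(A, B, epsilon):
    """Assumed kernel: k(a, b) = exp(-|a - b|^2 / epsilon)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / epsilon)


# Toy training data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
epsilon = 1.0

# Eigenpairs of P = D^{-1} K via the symmetric matrix D^{-1/2} K D^{-1/2}.
K = gaussian_kernel(X, X, epsilon)
d = K.sum(axis=1)
S = K / np.sqrt(np.outer(d, d))
w, v = np.linalg.eigh(S)  # eigenvalues in ascending order
lam = w[::-1][1:3]  # two leading nontrivial eigenvalues
psi = (v / np.sqrt(d)[:, None])[:, ::-1][:, 1:3]  # matching eigenvectors of P


def nystroem_extend(Y):
    """psi_j(y) = (1 / lam_j) * sum_i p(y, x_i) * psi_j(x_i)."""
    K_new = gaussian_kernel(Y, X, epsilon)  # needs the kernel function itself
    P_new = K_new / K_new.sum(axis=1, keepdims=True)
    return (P_new @ psi) / lam


# Sanity check: extending to training points reproduces their coordinates.
psi_Y = nystroem_extend(X[:5])
```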

devmessias commented 4 years ago

How can I pass a precomputed distance matrix?