Modified LLE? - Githubissues

lucapinello commented 7 years ago

Fantastic library!

Would it be possible to have the modified version of the LLE implemented in sklearn? http://scikit-learn.org/stable/modules/generated/sklearn.manifold.LocallyLinearEmbedding.html

mmp2 commented 7 years ago

Glad you like it! We do have a LLE implementation, but I am not 100% sure it has all the options from sklearn. What in particular do you think it is missing?

Best, Marina Meila


jmcq89 commented 7 years ago

Hi Lucapinello,

As you've noticed, we currently only support standard LLE rather than modified LLE or HLLE. This is something we can definitely support in the future, but we will need to take the time to implement it carefully.

If this is something you're interested in implementing yourself, feel free to open a pull request; otherwise we can add it to the list of methods to add.

Cheers, James

lucapinello commented 7 years ago

Wow, thanks for the super quick reply! I have used your implementation of the modified LLE in sklearn for some biological datasets and it works really well. The problem is that it does not offer a radius option. Marina, if you have a simple implementation of the modified LLE with a radius, that would be perfect! James, I am not sure I am capable of implementing it myself in a reasonable time, but I am happy to test it!
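For reference, the scikit-learn variant mentioned here is selected with method="modified" on LocallyLinearEmbedding, and it is parameterized by a neighbor count rather than a radius. A minimal sketch, assuming synthetic swiss-roll data in place of the biological datasets (which are not part of this thread) and an illustrative n_neighbors value:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# Synthetic stand-in data (assumption): the datasets from the thread are not available.
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# method="modified" selects MLLE. scikit-learn only exposes n_neighbors here;
# there is no radius parameter. MLLE requires n_neighbors >= n_components.
mlle = LocallyLinearEmbedding(
    n_neighbors=12,        # illustrative value, not a recommendation
    n_components=2,
    method="modified",
    eigen_solver="dense",  # deterministic and fine for a few hundred points
)
Y = mlle.fit_transform(X)
print(Y.shape)
```

The neighbor count is the only locality parameter in this call, which is the limitation the comment above describes.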

jmcq89 commented 7 years ago

Unfortunately, Modified LLE (MLLE: https://papers.nips.cc/paper/3132-mlle-modified-locally-linear-embedding-using-multiple-weights.pdf) is a non-trivial modification of standard LLE (which is included in megaman).

As you've noted, sklearn only offers a k-nearest-neighbors matrix, whereas we prefer a radius-neighbors matrix, for geometric reasons. Adjusting the sklearn code, or adding MLLE to megaman so that it uses a radius-based kernel, would require a substantial amount of programming time to implement and validate.
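The difference between the two neighborhood definitions can be seen directly on unevenly sampled data. The sketch below uses scikit-learn's kneighbors_graph and radius_neighbors_graph on synthetic points; the data, k, and radius are illustrative assumptions and are not taken from megaman or from the thread.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

rng = np.random.default_rng(0)
# Unevenly sampled data (assumption): one tight cluster and one diffuse cluster.
dense = rng.normal(0.0, 0.1, size=(200, 2))
sparse = rng.normal(3.0, 1.0, size=(50, 2))
X = np.vstack([dense, sparse])

k = 10
radius = 0.3  # illustrative value

knn = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
rad = radius_neighbors_graph(X, radius=radius, mode="connectivity")

# Number of neighbors assigned to each point by each rule.
knn_degree = np.asarray(knn.sum(axis=1)).ravel()
rad_degree = np.asarray(rad.sum(axis=1)).ravel()

print("k-NN neighbors per point:  ", knn_degree.min(), "to", knn_degree.max())
print("radius neighbors per point:", rad_degree.min(), "to", rad_degree.max())
```

With k-NN every point gets exactly k neighbors, so the neighborhood's physical size stretches in sparse regions. With a fixed radius the physical size is constant and the neighbor count varies with sampling density.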

What is the timeline for your project? It is unlikely that I, personally, will be able to implement this this month as I also have my thesis defense, but it's possible we could find another contributor to do this or I can work on this towards June.

lucapinello commented 7 years ago

James, of course! Your defense is the most important thing. I can definitely wait, and if you want, when you have time, I can tell you more about the use case we have in mind (single-cell genomics data). Thanks again to you and Marina for the quick reply.

huidongchen commented 7 years ago

Hi, James and Marina,

Awesome package! I'm working with Luca. Thanks a lot for your quick reply. We are very excited to use your tool in our RNA-seq data analysis.

We tried LLE, but the result looks different from that of standard LLE in the scikit-learn package with the same parameters. We are not sure whether we missed something. I have attached the sample data and the .ipynb file we are using. We would really appreciate it if you could take a look when you have time. Many thanks!

Example.zip (https://github.com/mmp2/megaman/files/980669/Example.zip)

mmp2 commented 7 years ago

Hi Luca and Huidong,

The default values in our API are often different from scikit-learn's. Since sklearn was written, the understanding of ML has advanced, and our package implements the state of the art (this is one reason we have radius neighborhoods and not k-nn).

If your results changed, you may want to try a different radius. In general, no fixed radius will give you the same results as k-nn. I would also recommend using SpectralEmbedding, which implements diffusion maps. The DM algorithm has emerged as the most theoretically well-founded of the tractable algorithms (which means it will give fewer artefacts in the embeddings). Isomap is also good but much slower.
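To make the diffusion-maps recommendation concrete, here is a minimal NumPy sketch of the textbook construction: Gaussian kernel, density normalization, then the leading non-trivial eigenvectors of the resulting random walk. It is not megaman's SpectralEmbedding code. The function name diffusion_map, the kernel form exp(-d^2 / radius^2), the dense (untruncated) kernel, and the test data are all assumptions made for illustration.

```python
import numpy as np

def diffusion_map(X, radius, n_components=2, alpha=1.0):
    """Hypothetical helper: dense diffusion-map embedding of the rows of X."""
    # Pairwise squared distances and a Gaussian kernel with bandwidth `radius`.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dists / radius**2)

    # Density normalization; alpha=1 removes the effect of sampling density.
    q = K.sum(axis=1)
    K = K / np.outer(q**alpha, q**alpha)

    # Symmetric matrix S = D^{-1/2} K D^{-1/2}, similar to the random walk D^{-1} K.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))

    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Convert to right eigenvectors of the random walk and drop the constant one.
    psi = eigvecs / np.sqrt(d)[:, None]
    embedding = psi[:, 1:n_components + 1] * eigvals[1:n_components + 1]
    return embedding, eigvals

# Synthetic data (assumption): a noisy circle embedded in 3-D.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=300)
X = np.column_stack([np.cos(theta), np.sin(theta), 0.05 * rng.normal(size=300)])

Y, eigvals = diffusion_map(X, radius=0.5)
print(Y.shape)
```

The radius plays the role of the kernel bandwidth, so changing it changes which scale of structure the embedding resolves. This is one reason results differ from a k-nn based run.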

If you want to really know what your algorithm is doing (and you will be amazed by what you will see), try the Riemannian metric -- it will display how your data was distorted by the embedding.
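One way to see what "distortion" means here: given a graph Laplacian L and embedding coordinates Y, a local dual metric can be estimated at each point as H_ij = 0.5 * (L(y_i y_j) - y_i L(y_j) - y_j L(y_i)). The sketch below implements that formula in NumPy. It is an illustration under stated assumptions, not megaman's own routine. It uses a random-walk Laplacian L = P - I (zero row sums, non-negative off-diagonal entries), ignores the overall scale factor, and the name dual_metric and the test data are hypothetical.

```python
import numpy as np

def dual_metric(L, Y):
    """Hypothetical helper: H[k] is a d x d matrix estimated at point k."""
    n, d = Y.shape
    LY = L @ Y
    H = np.empty((n, d, d))
    for i in range(d):
        for j in range(d):
            H[:, i, j] = 0.5 * (
                L @ (Y[:, i] * Y[:, j]) - Y[:, i] * LY[:, j] - Y[:, j] * LY[:, i]
            )
    return H

# Synthetic data (assumption): points in the unit square.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))

# Random-walk Laplacian from a Gaussian kernel (illustrative radius).
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / 0.3**2)
P = K / K.sum(axis=1, keepdims=True)
L = P - np.eye(len(X))

# An "embedding" that stretches the first coordinate by a factor of 3.
Y_plain = X.copy()
Y_stretched = X * np.array([3.0, 1.0])

H_plain = dual_metric(L, Y_plain)
H_stretched = dual_metric(L, Y_stretched)

# The stretch shows up as a factor of 9 in the (0, 0) entry at every point.
print(np.median(H_stretched[:, 0, 0] / H_plain[:, 0, 0]))
```

In this construction the metric itself would be obtained by (pseudo-)inverting each H[k]. Comparing it across points shows where an embedding stretches or compresses the data.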

Best,

Marina


huidongchen commented 7 years ago

Hi, Marina,

Thanks for your detailed comments and suggestion. It's really helpful!

I have another quick question. We noticed that in the Geometry class the number of neighbors can be set, e.g. "adjacency_kwds = {'n_neighbors':20}". We are a bit confused, since LLE in megaman is based on a radius rather than a number of neighbors. That's why we were hoping to see results similar to those of scikit-learn's LLE when we set the same number of neighbors. So should I understand that the radius is still being used even if we set the number of neighbors?

Thanks again for your kind and patient help.

Best, Huidong

mmp2 / megaman

Modified LLE? #71