vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0

README details on support for R #59

Closed evanbiederstedt closed 3 years ago

evanbiederstedt commented 3 years ago

> There is no standalone version of leidenalg, and you will always need python to access it. There are no plans for developing a standalone version or R support. So, use python.

FWIW, we have a pared-down version in R here: https://github.com/kharchenkolab/leidenAlg https://cran.r-project.org/web/packages/leidenAlg/index.html

It's possible this is worth mentioning, or not :)

Just letting you know. Feel free to close issue.

Best, Evan

TomKellyGenetics commented 3 years ago

Hi Evan,

Thanks for sharing this. I've developed an R package to call the Python module via 'reticulate' also:

https://github.com/TomKellyGenetics/leiden https://cran.r-project.org/web/packages/leiden/index.html

Of course, removing Python dependencies would be ideal. I'm aware that @vtraag has added cluster_leiden to the igraph development version but it is not yet available on CRAN. It has less functionality than the python version, is this also the case for your package? https://github.com/igraph/rigraph/issues/346

I agree the README is outdated as there are several R versions now available, although I'm grateful this was made clear originally, as it motivated me to build an R package myself.

Thanks,

Tom K

evanbiederstedt commented 3 years ago

Hi @TomKellyGenetics

> It has less functionality than the python version, is this also the case for your package?

I'm not really sure, to be honest. You might be a better judge than I. https://github.com/kharchenkolab/leidenAlg

We did include the entire C++ library, so adding more functionality wouldn't be too tricky. If you have feature requests, please let me know.

> I agree the README is outdated as there are several R versions now available, although I'm grateful this was made clear originally, as it motivated me to build an R package myself.

I'm sympathetic to this :) Maybe it's a good idea to share other versions now, given they've been implemented. Or not---I'm not really sure what the "right call" is.

Thanks, Evan

TomKellyGenetics commented 3 years ago

Thanks Evan,

The 2 R packages are complementary in my view but I suppose it's up to @vtraag on whether to include them in the README. As for limited functionality of native R versions, see here for further discussion. I am planning to migrate to calling R implementations without Python dependencies where possible but decided to retain reticulate to call functions only available in Python or C++.

Of course, an Rcpp version should have better performance than reticulate so I'm glad to hear you have been working on this. There's another one called leidenbase but I believe it's not available on CRAN so yours would be easier to install.

Thanks,

Tom K

evanbiederstedt commented 3 years ago

Ah, I see. Thank you for the context, @TomKellyGenetics

Yes, I'm not necessarily advocating anything with this issue---feel free to ignore it, or add the R packages in the README, or anything really :)

My motivation was simply to mention there are some (basic) R implementations around. Interested users may want to access them.

Best, Evan

vtraag commented 3 years ago

@evanbiederstedt, nice to see that you developed an R interface for the C++ code, instead of working through reticulate as @TomKellyGenetics has done. I assume that the igraph object in R could then be immediately used, instead of having to be "translated" to Python? It was my understanding that this led to a number of problems when using reticulate, didn't it @TomKellyGenetics?

I would be more than happy to include a reference in the README.md! Alternatively, we could also collaborate towards providing full support for the package in R, I know some people are also waiting to be able to use multilayer community detection in R from this package.
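For readers unfamiliar with the multilayer (multiplex) case mentioned above, here is a rough, dependency-free sketch of the idea as I understand it: every layer shares the same node set, one membership vector is scored on each layer, and the per-layer qualities are combined as a weighted sum. The function names `modularity` and `multiplex_quality`, the toy layers, and the choice of plain Newman modularity as the quality function are illustrative assumptions only. This is not the leidenalg API, and it only scores a given partition rather than optimising one.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman modularity of an undirected, unweighted graph.

    edges      -- list of (u, v) pairs with integer node ids
    membership -- list where membership[node] is that node's community
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    # Q = sum over communities of e_c / m - (K_c / 2m)^2
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def multiplex_quality(layers, membership, layer_weights=None):
    """Weighted sum of per-layer modularity for one shared membership.

    Hypothetical helper: each layer is an edge list over the same nodes.
    """
    if layer_weights is None:
        layer_weights = [1.0] * len(layers)
    return sum(w * modularity(edges, membership)
               for w, edges in zip(layer_weights, layers))


# Toy data (made up for illustration): six nodes, two layers.
layer_a = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
layer_b = [(0, 1), (1, 2), (3, 4), (4, 5)]
membership = [0, 0, 0, 1, 1, 1]

combined = multiplex_quality([layer_a, layer_b], membership)
```

A multiplex optimiser would search for the membership that maximises this combined score across all layers at once, which is why the result can differ from clustering each layer separately.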

TomKellyGenetics commented 3 years ago

Just to clarify, multiplex graphs are now supported (version 0.3.6 on CRAN) in the reticulate version: https://github.com/TomKellyGenetics/leiden/issues/7

If this could be done in Rcpp instead, it would avoid memory issues with large graphs for example. There also seems to be a bug that currently caps the "leiden" package at 100,000 nodes. It is stable and gives exactly the same results as the Python "leidenalg", but there are several reasons for poorer performance with the Python-R interface. This is the same reason I'd prefer to call leiden in "r-igraph" once it becomes available on CRAN.

evanbiederstedt commented 3 years ago

Hi @vtraag

> I would be more than happy to include a reference in the README.md! Alternatively, we could also collaborate towards providing full support for the package in R, I know some people are also waiting to be able to use multilayer community detection in R from this package.

Both options work for me! I'm happy to help collaborate towards providing full support (and I do like the idea of leidenAlg being included in the README :) ). You may have to update me on the details regarding what needs to be done. However, I'm happy to work towards improving the package.

> I assume that the igraph object in R could then be immediately used,

I believe this is right, yes.

> If this could be done in Rcpp instead, it would avoid memory issues with large graphs for example.

CC @TomKellyGenetics

Yes, if it works with the C++ code, it should not result in memory issues when using an Rcpp interface. That's really all we've done with leidenAlg.

As I say, if you give me a few more details as to what needs to be done, I'm happy to help. We should possibly take this to an email thread.

Best, Evan

TomKellyGenetics commented 3 years ago

@brgew As an aside, I notice that you contributed leiden-related functions to igraph and rigraph. Do you consider those to be ready for 'production' use? If so, I may consider using those rather than leidenbase.

@vtraag Yes, they are ready for use, but the implementation is much less flexible than what is provided in this package. Nonetheless, it might just cover 80% of the use cases of people.

I must admit that I forgot about leidenbase. @evanbiederstedt also developed an R package to interface with leidenalg and @TomKellyGenetics developed an R package based on reticulate, see #59 for some discussion. Perhaps it would be an opportunity to join forces and make a common supported implementation available that exposes all the functionality in the leidenalg package in a native R package, including the multiplex support?

As discussed in #62, multiple R implementations are now available. Happy to chat about merging native R versions together (the benefits of improved performance and fewer dependencies are clear, I think). The main drawback at the moment seems to be limited features compared to the Python module (which leiden calls with reticulate). I'm open to adding them to the leiden package and replacing Python calls where applicable (as has already been done for the development version of igraph pending release of these dependencies on CRAN).

vtraag commented 3 years ago

I just made a new release (version 0.8.7) in which I added the mentioned R packages in the README.