Open mbuttner opened 3 months ago
ping @berombau
I like the idea of using the official FlowSOM, but I'd be curious how it compares to @burtonrj's implementation in terms of speed.
We could keep a wrapper in pytometry for visibility and backwards compatibility that requires flowSOM as an optional dependency. Referring to it (including some of its nice visualizations) in the tutorial/documentation sounds good to me.
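A minimal sketch of the lazy-import pattern such a wrapper could rely on. The helper name `require_optional` and the extra name passed to it are hypothetical and not part of pytometry; the sketch also assumes the package is importable under the module name `flowsom`.

```python
import importlib
import importlib.util


def require_optional(module_name: str, extra: str):
    """Import an optional dependency or fail with an actionable message.

    Hypothetical helper: neither this function nor the `extra` name
    exists in pytometry; they only illustrate the pattern.
    """
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is an optional dependency. "
            f"Install it with: pip install 'pytometry[{extra}]'"
        )
    return importlib.import_module(module_name)
```

A wrapper function would call `require_optional("flowsom", extra="flowsom")` as its first step, so the import cost and the dependency only apply to users who actually run the clustering.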
Since FlowSOM_Python is implemented by the original authors and integrated within the scverse ecosystem, it makes sense to use it. The solution proposed by @grst sounds good to me!
And it would also be great to check if the results are consistent for the two packages, but this requires quite some work.
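One possible way to quantify that consistency is the adjusted Rand index, which compares two label assignments for the same cells and ignores how the clusters are numbered. This is a suggestion, not something either package provides in this form; the function below is a plain-Python sketch and the label lists in the usage note are made up.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label sequences must have the same length")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree on

    # Pairs of cells grouped together in both, in A only, and in B only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / total_pairs  # agreement expected by chance
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both
    return (both - expected) / (maximum - expected)
```

For example, `adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])` gives 1.0 because only the cluster names differ. Since SOM training starts from a random initialization, a fair comparison would also fix the seeds and repeat the run several times.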
Hi everyone, thank you for the discussion. We're ok with these proposed actions. The flowsom package itself depends on pytometry currently for a function `normalize_estimate_logicle`, but I'll try to make this and other dependencies optional so you can easily integrate flowsom with minimal dependencies. The MuData dependency is minimal and will probably stay.
The current implementation depends on Numba for speed. There is ongoing work on a batched SOM training update that would further increase parallelization, which we hope to conclude by the summer.
Alternative versions can reuse the scverse integration and visualizations of our package by implementing `flowsom.models.BaseFlowSOMEstimator`. It's even possible to mix-and-match the models for overclustering and metaclustering, but this is mostly for benchmarking. We can add additional models if that would provide better continuity for users. We can try to make it as consistent as possible, but this is indeed not that trivial. Sometimes there are slight differences and a previous analysis will not be fully reproducible. It's easier to work with containers or an older package version then.
I added some of these changes in https://github.com/saeyslab/FlowSOM_Python/tree/interop-pytometry in preparation for a 0.0.2 version. The pytometry package is now an additional install as explained in the notebook. We do require the 0.1.5 version, which is not yet released on PyPI: https://github.com/scverse/pytometry/issues/69.
Hi @berombau thank you for the update. My PyPI account is still not operational and there has been no response to my account recovery request from PyPI in the past two months. I keep working on it!
Hi @berombau I recovered access to my PyPI account and uploaded pytometry version 0.1.5: https://pypi.org/project/pytometry/0.1.5/
So in FlowSOM_Python we need the 0.1.5 version for the pytometry function `normalize_autologicle`. A PyPI installation still does not work because of the pandas issue, which is now fixed in https://github.com/burtonrj/consensusclustering/issues/1 and pytometry version 0.1.6.dev5. So with a PyPI 0.1.6 release, I think the installation issue will be resolved.
Hi everyone,
following a brief discussion with @burtonrj in #71: There is a Python implementation of `FlowSOM` by the original authors (https://github.com/saeyslab/FlowSOM_Python), which offers comprehensive FlowSOM clustering functionality and effectively carries over the functionality of the `FlowSOM` R package. It depends on `scVerse` packages like `pytometry` and `MuData`. The `pytometry` package currently uses @burtonrj's implementation of FlowSOM, which depends on the packages `miniSOM` and `consensusclustering`. Hence, first, we have two parallel implementations whose efforts could be better integrated, and second, we would like to reduce the number of dependencies in `pytometry` (see #64) as part of the governance strategy.

Possible actions

@burtonrj suggested to
- remove the `FlowSOM` functionality from `pytometry` and therefore `consensusclustering` and `minisom`.
- refer to the `FlowSOM` implementation of @saeyslab and add documentation accordingly.

As a perspective, one should start a discussion about the integration of the `FlowSOM` package in the `scverse`.

I am happy with this suggestion in general and would like to suggest some modifications to provide continuity for all users who are already using the current FlowSOM implementation in `pytometry`:
- make `consensusclustering` and `minisom` optional dependencies in the next version.
- ... `consensusclustering` and replace the current example with a pointer to the `FlowSOM` Python package.

I'd like to hear @grst's and @quentinblampey's thoughts on this.