SofieVG / FlowSOM

Using self-organizing maps for visualization and interpretation of cytometry data

Add seed for automatic meta cluster inference #35

Closed ghar1821 closed 3 years ago

ghar1821 commented 4 years ago

Hi Sofie, I just realised that there is no way to set a seed when FlowSOM automatically infers the optimal number of meta clusters. I tried passing a seed via the seed parameter, but it does not seem to be used at all. Being able to set a seed is handy because I can then rerun the automatic inference and get the same result. Otherwise, the meta clustering looks different on every run.

I've looked through the code, and I think the change is relatively simple: explicitly pass the seed parameter into MetaClustering, DetermineNumberOfClusters, and the consensus function (see attached screenshot).
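A minimal sketch of the pass-through idea in base R. This is not FlowSOM's code: `cluster_step` and `meta_cluster_sketch` are hypothetical names, and `stats::kmeans` stands in for the real consensus step. The point is only that the outer function forwards `seed` rather than dropping it.

```r
# Hypothetical inner step (an assumption, not a FlowSOM function):
# clusters the rows of `x` into `k` groups, seeding the RNG if a seed is given.
cluster_step <- function(x, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  stats::kmeans(x, centers = k, nstart = 5)$cluster
}

# Hypothetical outer wrapper: forwards `seed` to the inner step
# so that the caller's choice actually reaches the random part.
meta_cluster_sketch <- function(x, k, seed = NULL) {
  cluster_step(x, k, seed = seed)
}

# Toy data: two well-separated groups of 20 points each.
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

a <- meta_cluster_sketch(x, k = 2, seed = 42)
b <- meta_cluster_sketch(x, k = 2, seed = 42)
same_result <- identical(a, b)
```

With the same seed forwarded on both calls, `a` and `b` are the same labelling, which is the reproducibility being asked for here.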

What do you think?

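[image: Screen Shot 2020-08-26 at 7 18 31 pm] https://user-images.githubusercontent.com/5366317/91411662-fe229980-e88b-11ea-845e-17d81215af15.png
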
SofieVG commented 4 years ago

Hi Givanna,

Thank you for your suggestion. I agree that this is not the expected behaviour, and your adaptations make sense! We are currently working on a big update of the package (not on GitHub yet, but it should be there soon), and I will make sure to include it there. In the meantime, would it be an option for you to call set.seed(seed) right before calling the FlowSOM code, or is this approach not sufficient?

If you are trying out automatic inference of the number of clusters, I'd also suggest making sure your max value is big enough. If the max value is close to the optimal number of clusters, the tool has difficulty finding the right elbow point and will typically underestimate.
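To illustrate the elbow-point remark, here is a toy elbow finder in base R. It is a generic two-segment linear fit, offered as an assumption about how an elbow can be located; it is not claimed to be the method FlowSOM uses. `find_elbow_sketch` and the example curve are made up for illustration.

```r
# Toy elbow finder: try every split of the curve into two segments,
# fit a straight line to each, and keep the split with the smallest
# total squared residual. Returns the index of the last point of the
# first (steep) segment.
find_elbow_sketch <- function(y) {
  n <- length(y)
  stopifnot(n >= 4)
  x <- seq_len(n)
  best_k <- NA_integer_
  best_err <- Inf
  for (k in 2:(n - 2)) {
    left  <- stats::lm(y[1:k] ~ x[1:k])
    right <- stats::lm(y[(k + 1):n] ~ x[(k + 1):n])
    err <- sum(stats::residuals(left)^2) + sum(stats::residuals(right)^2)
    if (err < best_err) {
      best_err <- err
      best_k <- k
    }
  }
  best_k
}

# Made-up quality curve: steep drop up to index 4, then nearly flat.
y <- c(100, 70, 40, 10, 9, 8, 7, 6, 5, 4)

elbow_full  <- find_elbow_sketch(y)       # whole curve available
elbow_short <- find_elbow_sketch(y[1:5])  # curve cut off just past the bend
```

In this toy, the full curve gives an elbow at index 4, while the truncated curve gives 3, because too few points remain after the bend to fit the flat segment. That is at least consistent with the advice to keep the max value well above the expected number of clusters.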

All the best, Sofie

ghar1821 commented 4 years ago

Thank you for the suggestion Sofie. Will definitely make sure that the max value is high enough.

Unfortunately, set.seed does not work: the ConsensusClusterPlus function is called with no seed, so it sets its own seed (using set.seed), overriding any previous set.seed call. See attached.
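The behaviour described above can be mimicked with a small base R stand-in. `reseeding_sampler` is a hypothetical function written for this illustration, with a time-based fallback assumed from the description in this comment; it is not the library's actual code.

```r
# Hypothetical stand-in for a function that always calls set.seed itself.
# If no seed is passed, it falls back to a time-based seed, which discards
# whatever seed the caller set beforehand.
reseeding_sampler <- function(n, seed = NULL) {
  if (is.null(seed)) {
    seed <- as.numeric(Sys.time())  # assumed fallback, for illustration only
  }
  set.seed(seed)
  sample.int(1000, n)
}

# An outer set.seed has no lasting effect, because the function reseeds.
# Passing the seed explicitly is what makes the result repeatable:
set.seed(1)
first <- reseeding_sampler(5, seed = 42)
set.seed(2)
second <- reseeding_sampler(5, seed = 42)
repeatable <- identical(first, second)
```

Calling `reseeding_sampler(5)` without a seed after `set.seed(1)` would draw from the time-based seed instead, which is why seeding outside the call is not enough in this situation.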

[image: Screen Shot 2020-08-28 at 1 34 37 pm]

SofieVG commented 3 years ago

Hi Givanna,

This should finally be solved in the latest update! Thanks for your help!