SofieVG / FlowSOM

Using self-organizing maps for visualization and interpretation of cytometry data
61 stars 26 forks source link

SOM should protect against inputs with no column names #41

Closed LTLA closed 3 years ago

LTLA commented 3 years ago

In the following example:

library(FlowSOM)
set.seed(1000)
x <- matrix(runif(10000), ncol=10)
out <- SOM(x, xdim=5, ydim=5)
out$mapping[,1]
##   [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 # and on it goes

All observations are assigned the same code - odd. Diving into the source reveals the offending line:

https://github.com/SofieVG/FlowSOM/blob/6f79b8b7afa1f2996dbeae4292bbbfb93e92b3a1/R/2_buildSOM.R#L201-L209

Here, colnames(codes) is NULL, which causes data to be subsetted to an empty numeric vector. The C code subsequently performs out-of-bounds accesses, leading to very difficult memory-related bugs downstream of using SOM().

An expedient solution would be to simply enforce identical colnames in any non-NULL value of codes. If codes=NULL, then it's derived from the data and there is no need for a separate check that the columns are matching.

SamGG commented 3 years ago

Already reported #38, but not yet implemented IIRC. I hope Sofie will have time to take care of this point.

LTLA commented 3 years ago

Well, you know what they say... when it rains, it pours.

FlowSOM inherits a memory-related problem from its dependencies, namely cytolib via flowCore; see RGLab/cytolib#45. This means that I am not able to call FlowSOM with any other package that uses global C variables, which pretty much makes it impossible to use in my bluster package (for coordinating different clustering methods).

So, I went for the nuclear option of copying all the relevant code from FlowSOM into bluster to avoid these dependency problems (and also fix this issue at the same time). I had to reimplement it in C++, though, which led to #42.

SamGG commented 3 years ago

I thought that when it rains, we get wet... I think that there are not a lot of differences between the core code of FlowSOM and the SOM implemented in the kohonen package. Did you have a look?

LTLA commented 3 years ago

Yes, and I couldn't see any change-related termination code in kohonen.

SofieVG commented 3 years ago

Hi Aaron and Samuel,

I'll try to look into this and remember our reasoning next week!

All the best, Sofie

On Fri, 8 Jan 2021, 09:55 Aaron Lun, notifications@github.com wrote:

Yes, and I couldn't see any change-related termination code in kohonen.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/SofieVG/FlowSOM/issues/41#issuecomment-756633893, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOS72Z7QFDWYPRNZREWU4DSY3B6NANCNFSM4VXA7K3A .

SofieVG commented 3 years ago

The issue regarding the colnames should be solved with the latest commit.