peterwittek / somoclu

Massively parallel self-organizing maps: accelerate training on multicore CPUs, GPUs, and clusters
https://peterwittek.github.io/somoclu/
MIT License

RCPP Not Compatible #127

Open lkoll opened 6 years ago

lkoll commented 6 years ago

I am attempting to use the R implementation of Rsomoclu.train with the following parameters on a sparse matrix, dimensions 480k x 18k.

nEpochs<-5
nx<-70
ny<-70
som_train<-Rsomoclu.train(dfm,
                          nEpoch=nEpochs,
                          nSomX = nx,
                          nSomY = ny,
                          kernelType = 2,
                          radius0 = 0,
                          radiusN = 1,
                          radiusCooling = "linear",
                          scale0 = 1,
                          scaleN = .01,
                          scaleCooling = "linear")

When I run this it fails almost instantly with the following error:

> som_train<-Rsomoclu.train(dfm,
+                           nEpoch=nEpochs,
+                           nSomX = nx,
+                           nSomY .... [TRUNCATED] 
terminate called after throwing an instance of 'Rcpp::not_compatible'
  what():  Not compatible with requested type: [type=S4; target=double].

Am I setting a parameter incorrectly?
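
For readers hitting the same message: `type=S4; target=double` suggests that an S4 object (such as a sparse matrix from the Matrix package) reached code that expects a plain numeric matrix. A minimal sketch of how to check this, assuming the Matrix package is installed; the tiny `sp` object is a hypothetical stand-in for the real `dfm`:

```r
library(Matrix)

# Hypothetical 3 x 2 stand-in for the real dfm, stored in triplet form.
# The i and j slots of a dgTMatrix are zero-based.
sp <- new("dgTMatrix",
          i = c(0L, 2L), j = c(0L, 1L), x = c(4, 5),
          Dim = c(3L, 2L))

isS4(sp)       # TRUE: this is an S4 object, not a base R matrix
is.matrix(sp)  # FALSE: base R does not treat it as a matrix

# Converting to a dense base matrix gives a plain numeric (double) matrix.
dense <- as.matrix(sp)
is.matrix(dense)
storage.mode(dense)
```

If `isS4()` returns TRUE for the object passed as the data argument, the type is a more likely culprit than any of the training parameters.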

xgdgsc commented 6 years ago

Can you provide a MWE with the dfm?

lkoll commented 6 years ago

Sorry, what is a MWE?

xgdgsc commented 6 years ago

https://www.wikiwand.com/en/Minimal_Working_Example

lkoll commented 6 years ago

MWE.txt

Here are the first 1000 rows of my dfm. I ran it with the same parameters I was using on the whole matrix, and it immediately errored out again. I'm not sure whether you need it, but below is the structure of the MWE matrix.

Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:220352] 378 303 532 652 699 920 923 966 977 239 ...
  ..@ j       : int [1:220352] 0 6 6 6 6 6 6 6 6 7 ...
  ..@ Dim     : int [1:2] 1000 17770
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:220352] 4 5 1 2 3 2 5 5 2 2 ...
  ..@ factors : list()

lkoll commented 6 years ago

@xgdgsc Just wanted to make sure you received my dataset, please let me know if you find anything out.

xgdgsc commented 6 years ago

library('Rsomoclu')
data("rgbs", package = "Rsomoclu")
dfm = read.csv("~/tmp/MWE.txt",sep = " ")
dfm_m = data.matrix(dfm)
nEpochs<-5
nx<-70
ny<-70
som_train<-Rsomoclu.train(dfm_m,
                          nEpoch=nEpochs,
                          nSomX = nx,
                          nSomY = ny,
                          kernelType = 0,
                          radius0 = 0,
                          radiusN = 1,
                          radiusCooling = "linear",
                          scale0 = 1,
                          scaleN = .01,
                          scaleCooling = "linear")

The problem is that your data frame is not converted to a matrix before training.

You also have to remove the header from the file:

screenshot_20180724_220356

lkoll commented 6 years ago

I don't think that's my issue. When you read in the data, use the readMM function from the "Matrix" package as follows. The data is already in sparse matrix form when I run it.

dfm<-readMM("MWE.txt")
> str(dfm)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:220352] 378 303 532 652 699 920 923 966 977 239 ...
  ..@ j       : int [1:220352] 0 6 6 6 6 6 6 6 6 7 ...
  ..@ Dim     : int [1:2] 1000 17770
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:220352] 4 5 1 2 3 2 5 5 2 2 ...
  ..@ factors : list()

xgdgsc commented 6 years ago

A dense matrix is the only type we support here, so you may need to convert it like this:

dfm<-readMM("~/tmp/MWE.txt")
dfm_m = as.matrix(dfm)
...

The dense conversion seems to take much more memory, so the recommended approach in this case is the command-line version.
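
To put the memory cost in rough numbers, here is a back-of-the-envelope sketch in base R. It assumes 8 bytes per dense cell (R doubles) and about 16 bytes per stored nonzero in triplet form (two 4-byte integer indices plus one double). The helper names are hypothetical, and the dimensions are the approximate ones quoted in this thread:

```r
# Approximate size of a dense double matrix, in GiB.
dense_gib <- function(n_rows, n_cols, bytes_per_cell = 8) {
  n_rows * n_cols * bytes_per_cell / 2^30
}

# Approximate size of triplet-form sparse storage, in GiB
# (i and j as 4-byte integers, x as an 8-byte double; overhead ignored).
triplet_gib <- function(n_nonzero, bytes_per_entry = 16) {
  n_nonzero * bytes_per_entry / 2^30
}

full_dense   <- dense_gib(480e3, 18e3)   # the full ~480k x 18k matrix
sample_dense <- dense_gib(1000, 17770)   # the 1000-row MWE
sample_trip  <- triplet_gib(220352)      # nonzeros reported by str() for the MWE

round(c(full_dense = full_dense,
        sample_dense = sample_dense,
        sample_trip = sample_trip), 4)
```

Under these assumptions the full matrix needs tens of GiB once densified, while the 1000-row sample fits easily, which is consistent with `as.matrix()` being workable for the MWE but not for the whole dataset.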

lkoll commented 6 years ago

I was under the impression that sparse matrices were supported because of the kernel type option of "Sparse CPU" in the following documentation (https://cran.r-project.org/web/packages/Rsomoclu/Rsomoclu.pdf). So there is no sparse matrix support for this package?

Edit: The following sources also seem to indicate that there is... https://peterwittek.com/training-emergent-self-organizing-maps-with-somoclu.html https://arxiv.org/abs/1305.1422

xgdgsc commented 6 years ago

Sparse support is mainly in the CLI (command-line) version. We haven't used R intensively enough to have considered this issue.

lkoll commented 6 years ago

It may be a good idea to update your R documentation in that case.

What is cli?

xgdgsc commented 6 years ago

https://github.com/peterwittek/somoclu#usage
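
To use the command-line version, the sparse matrix has to be written to a text file first. The sketch below assumes a libsvm-style layout, one line per row with zero-based `index:value` pairs; that layout and the handling of empty rows are assumptions that should be checked against the usage section linked above. The function name `to_libsvm_lines` is hypothetical, and the small vectors stand in for the `i`, `j`, and `x` slots of a `dgTMatrix`:

```r
# Build one text line per matrix row from zero-based triplets.
# Rows without any nonzero entry become empty strings.
to_libsvm_lines <- function(i, j, x, n_rows) {
  i <- as.integer(i)
  j <- as.integer(j)
  ord <- order(i, j)                      # sort by row, then by column
  pairs <- paste0(j[ord], ":", x[ord])    # "column:value" tokens
  by_row <- split(pairs, factor(i[ord], levels = seq_len(n_rows) - 1L))
  vapply(by_row, paste, character(1), collapse = " ", USE.NAMES = FALSE)
}

# Hypothetical 3-row example; row 1 (zero-based) has no nonzero entries.
lines <- to_libsvm_lines(i = c(2L, 0L, 0L),
                         j = c(1L, 3L, 0L),
                         x = c(5, 3.4, 1.2),
                         n_rows = 3)

out_file <- tempfile(fileext = ".svm")
writeLines(lines, out_file)
readLines(out_file)
```

For the matrix in this thread the call would look like `to_libsvm_lines(dfm@i, dfm@j, dfm@x, nrow(dfm))`. The resulting file would then be passed to the somoclu executable with the sparse kernel selected; the exact options are listed in the usage section.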