Closed ciropom closed 2 years ago
Hello, you should be able to find more information about the error in your Rserve log (I'm not sure about your specific installation, but it could be here: /var/lib/rserver/logs/Rserve.log). Could you paste it here, please?
Best, Iulian
There is nothing related in Rserve.log:
R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(Rserve) ; Rserve(args='--vanilla --RS-workdir /opt/rock-home/work/R --RS-conf /opt/rock-home/conf/Rserv.conf')
Starting Rserve:
/opt/R/4.1.2/lib/R/bin/R CMD /storage/R/x86_64-pc-linux-gnu-library/4.1/Rserve/libs//Rserve --vanilla --RS-workdir /opt/rock-home/work/R --RS-conf /opt/rock-home/conf/Rserv.conf
R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
Rserv started in daemon mode.
>
>
Loading required package: parallel
Loading required package: parallel
Loading required package: unixtools
Loading required package: resourcer
Loading required package: R6
Loading required package: httr
Registering LocalFileResourceGetter...
Registering HttpFileResourceGetter...
Registering ScpFileResourceGetter...
Registering GridFsFileResourceGetter...
Registering OpalFileResourceGetter...
Registering MariaDBResourceConnector...
Registering PostgresResourceConnector...
Registering SparkResourceConnector...
Registering PrestoResourceConnector...
Registering TidyFileResourceResolver...
Registering ShellResourceResolver...
Registering SshResourceResolver...
Registering RDataFileResourceResolver...
Registering RDSFileResourceResolver...
Registering SQLResourceResolver...
Registering NoSQLResourceResolver...
Loading required package: parallel
Loading required package: unixtools
Loading required package: unixtools
Loading required package: sqldf
Loading required package: gsubfn
Loading required package: proto
Loading required package: RSQLite
[... the same package-loading and registration messages repeat for each subsequent connection ...]
Loading required package: parallel
Loading required package: unixtools
Loading required package: readr
Loading required package: labelled
Today, however, the error is different:
Aggregated (partCov(Variables, "Wy0wLjA0NDksLTAuMDA3OSwtMC4wMDk3LC0wLjAwMTIsLTAuMDMyOSwtMC4wMTc...
Assigned expr. (Variables.scores <- pcaScores(Variables, "W1siLTAuMDEwMjQrMGkiLCIwLjAxNjExKzBpI...
Warning: Error in value[[3L]]: C stack usage 28231434 is too close to the limit
Hello,
This looks like an R memory limitation. I have a few suggestions, in increasing order of probability of success:
dssSetOption(list(expressions = 500000))
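Note that the "C stack usage ... too close to the limit" error concerns the C call stack, which is set by the operating system rather than by R's expression limit, so raising the shell's stack limit before launching Rserve may also help (this is a general R remedy for that error, not something specific to dsSwissKnife). From R you can check what stack size the session sees:

```r
# R's expression/recursion options and the C stack are separate resources:
# the "C stack usage" error is about the latter, which the OS controls.
# Cstack_info() reports the stack limit (in bytes) and current usage as
# seen by this session; 'size' can be NA in embedded setups such as
# Rserve child processes.
Cstack_info()
```

If the reported size is small, raising it in the shell that starts Rserve (for example with `ulimit -s unlimited` before `R CMD Rserve ...`) is worth trying.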
Hello, and thank you for your help. The workaround is not going to work, because the problem is not the number of rows (pretty limited: 64) but the number of columns (above 3000). Essentially, the princomp function (but I believe also the other functions) slows down a lot as the number of columns of the table grows.
I'll try the other things and let you know. Danilo
After upgrading and adding the option, I get the same error:
opals <- datashield.login(logins=data.frame(server='DEMO',url='http://127.0.0.1:8080',user='administrator',password='password',table='mixOmics.mixOmics.liver.toxicity'))
dssSetOption(list(expressions = 500000), datasources=opals)
datashield.assign(opals, 'D', 'mixOmics.mixOmics.liver.toxicity')
remote_pca <- dssPrincomp('D', async=F, datasources=opals)
Aggregated (partColMeans(D, FALSE)) [==================================================] 100% / 0s
Aggregated (partCov(D, "Wy0wLjA0NDksLTAuMDA3OSwtMC4wMDk3LC0wLjAwMTIsLTAuMDMyOSwtMC4wMTc0LDAuMDA...
Assigned expr. (D_scores <- pcaScores(D, "W1siMC4wMDU5OTgrMGkiLCIwLjAwMDIyMDIrMGkiLCIwLjAwNTM4K...
Error: There are some DataSHIELD errors, list them with datashield.errors()
> datashield.errors()
$DEMO
[1] "[Client error: (400) Bad Request]"
If you can reproduce the issue locally with the same dataset, it will help to understand whether it is reproducible or an issue with my setup. You will find the instructions in the first post. Thank you, Danilo
I am going to try in the following days but it seems reasonable to me that we are hitting some R limitations. Out of curiosity, does simple princomp() work on the same dataset?
and perhaps a better question: are you able to calculate the covariance matrix on the dataset?
Actually, I can see that princomp() doesn't work with more variables than rows in any case, and I use pretty much the same method. Granted, this is not the problem you're hitting: I think you are hitting an R limitation, but even if you weren't, the function would still fail. You could probably test this by taking a small subset of your data while keeping more columns than rows.
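This limitation is easy to reproduce locally: stats::princomp() is based on the eigendecomposition of the covariance matrix and stops on any matrix with fewer rows than columns, while prcomp(), which is SVD-based, does not have that restriction. A minimal illustration:

```r
set.seed(1)
x <- matrix(rnorm(5 * 10), nrow = 5, ncol = 10)  # 5 observations, 10 variables

# princomp() refuses data with more variables than units:
try(princomp(x))
# Error in princomp.default(x) :
#   'princomp' can only be used with more units than variables

# prcomp() works on the same data because it uses the SVD instead:
str(prcomp(x)$sdev)
```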
You are right, this dataset is not suitable for PCA. It looks like, for PCA to work, the number of instances should be significantly larger than the number of dimensions. This is not a dsSwissKnife issue.
Hello, when the number of expression profiles grows, we have trouble executing remote PCA. For instance, we use the liver.toxicity dataset from mixOmics (you can export it to CSV with the following code and then import it into Opal)
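A minimal sketch of such an export, assuming the gene-expression matrix `liver.toxicity$gene` (64 samples by roughly 3000 genes) is the table being uploaded, could be:

```r
library(mixOmics)

# liver.toxicity$gene holds the 64-sample gene-expression matrix
data(liver.toxicity)
write.csv(liver.toxicity$gene, file = "liver_toxicity.csv", row.names = TRUE)
```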
and execute the remote principal component analysis,
after a long time, we always get an error like the one below.