Open lichen11 opened 5 years ago
@xinyu-intel please help take a look
@lichen11 Just to confirm are you using the latest master branch or 1.3.x branch/tags?
Hi @lichen11, please try adding echo "USE_MKLDNN = 0" >> ./config.mk in step 4 to fix this error.
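A minimal shell sketch of that step, assuming config.mk sits in the MXNet source directory. The helper name disable_mkldnn is hypothetical and not part of the MXNet build scripts. It appends the setting only when the exact line is missing, so re-running an install script does not pile up duplicates.

```shell
#!/bin/sh
# disable_mkldnn is a hypothetical helper, not part of the MXNet build scripts.
# It appends "USE_MKLDNN = 0" to the given config file (default ./config.mk)
# unless that exact line is already present.
disable_mkldnn() {
    config_file="${1:-./config.mk}"
    touch "$config_file"
    if ! grep -qx 'USE_MKLDNN = 0' "$config_file"; then
        echo 'USE_MKLDNN = 0' >> "$config_file"
    fi
}

# Example, run from the incubator-mxnet folder before building:
#   disable_mkldnn ./config.mk
```

Since config.mk is read at build time, the setting presumably only takes effect once the library is rebuilt.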
@TaoLv yes, it is the latest master branch. @xinyu-intel I added the line, but before I could verify whether the previous error is fixed, I ran into another error:
Loading required package: devtools
Rscript -e "if(!require(devtools)||packageVersion('roxygen2') < '6.1.1'){install.packages('roxygen2', repo = 'https://cloud.r-project.org/')}"
Loading required package: devtools
Error in packageVersion("roxygen2") : package ‘roxygen2’ not found
Execution halted
make: *** [rpkg] Error 1
I verified that my R has the roxygen2 package installed, version 6.1.1. I am unsure why it gives this error.
@mxnet-label-bot add [R, installation]
I changed the line
Rscript -e "if(!require(devtools)||packageVersion('roxygen2') < '6.1.1'){install.packages('roxygen2', repo = 'https://cloud.r-project.org/')}"
to
Rscript -e "devtools::install_version('roxygen2',version='6.1.1',\
repos='https://cloud.r-project.org/',quiet=TRUE)"
I am able to perform Step 5 with no error. But when I load mxnet in R, I receive the following error:
Error: package or namespace load failed for ‘mxnet’:
.onLoad failed in loadNamespace() for 'mxnet', details:
call: dyn.load("R-package/inst/libs/libmxnet.so", local = FALSE)
error: unable to load shared object '/home/username/R-package/inst/libs/libmxnet.so':
/home/username/R-package/inst/libs/libmxnet.so: cannot open shared object file: No such file or directory
I fixed it with a very odd trick. When I run R in the terminal, I can load mxnet and train NNs, but when I use RStudio Server, it gives the above error. In RStudio Server, .libPaths() returns:
[1] "/home/username/R/x86_64-redhat-linux-gnu-library/3.5"
[2] "/usr/lib64/R/library"
[3] "/usr/share/R/library"
But in the R terminal, .libPaths() contains only "/usr/lib64/R/library" and "/usr/share/R/library".
So I just removed "/home/username/R/x86_64-redhat-linux-gnu-library/3.5" from the lib paths. However, mxnet is installed in "/home/username/R/x86_64-redhat-linux-gnu-library/3.5", not in the other two paths. It is very odd that I have to remove the lib path where mxnet is installed in order to get mxnet working. You might want to look into this as well.
@lichen11 instead of removing your .libPaths(), how about installing roxygen2 in your RStudio Server too?
Yes, I installed roxygen2 on the R server. I am still receiving the error:
Loading required package: mxnet
[1] "Loading local: inst/libs/libmxnet.so"
Error: package or namespace load failed for ‘mxnet’:
.onLoad failed in loadNamespace() for 'mxnet', details:
call: dyn.load("R-package/inst/libs/libmxnet.so", local = FALSE)
error: unable to load shared object '/home/user/R-package/inst/libs/libmxnet.so':
/home/user/R-package/inst/libs/libmxnet.so: cannot open shared object file: No such file or directory
Changing .libPaths() works but I would like to know what the cause is and what I can do to fix the problem permanently.
@lichen11 It sounds like you have two different R installations. They may partially share the .libPaths(), which may cause confusion.
Can you stick with one of them, remove the mxnet installation (delete the folder) and re-install? If a direct installation doesn't work, remember to try the trick from @xinyu-intel.
I checked the R versions. I am using R 3.5.1 (2018-07-02) Feather Spray on both R and R server.
I'm saying that your R and R server are different R installations, but they may share some .libPaths(). As a result, some package dependencies may have been installed by different installations, which can cause issues.
Please remove mxnet completely, and try to re-install mxnet in one of your installations.
I have a question about uninstalling mxnet. I am on a CentOS system. I tried remove.packages('mxnet') but mxnet is still there. I also ran sudo make clean in the incubator-mxnet folder, but I can still load mxnet. I could not find any documentation online on how to uninstall the mxnet R package.
Check your .libPaths() and remove the mxnet folders in each of the paths.
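A small shell sketch of that cleanup, assuming the library directories are passed in by hand (for example the three paths from the .libPaths() output earlier in this thread). find_mxnet_dirs is a hypothetical helper. It only lists the folders so they can be reviewed before anything is deleted.

```shell
#!/bin/sh
# find_mxnet_dirs is a hypothetical helper: given R library directories as
# arguments, it prints every "mxnet" package folder found directly inside them.
find_mxnet_dirs() {
    for lib_dir in "$@"; do
        if [ -d "$lib_dir/mxnet" ]; then
            printf '%s\n' "$lib_dir/mxnet"
        fi
    done
}

# Example with the library paths reported in this thread:
#   find_mxnet_dirs /home/username/R/x86_64-redhat-linux-gnu-library/3.5 \
#       /usr/lib64/R/library /usr/share/R/library
# Review the printed folders, then delete each one (e.g. with rm -rf).
```

Listing first and deleting second is a deliberate choice here: with two R installations sharing library paths, it is worth seeing which copies exist before removing them.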
Thanks. I removed the folders and installed new mxnet. I am following the example https://mxnet.incubator.apache.org/versions/master/tutorials/r/mnistCompetition.html but I trained with two GPUs using
model <- mx.model.FeedForward.create(lenet, X=train.array, y=train.y, ctx=list(mx.gpu(0), mx.gpu(1)), num.round=100, array.batch.size=100, learning.rate=0.05, momentum=0.9, wd=0.00001, eval.metric=mx.metric.accuracy, epoch.end.callback=mx.callback.log.train.metric(100))
I received the following error:
Auto-select kvstore type = local_update_cpu
Start training with 2 devices
Error in kvstore$set.optimizer(optimizer) : kvstore.cc:124: RCheck failed: names.size() == 2 && names[0] == "create.state" && names[1] == "update" Invalid optimizer
With either GPU on its own, the code runs without error. Does mxnet R automatically support multiple GPUs?
@anirudhacharya do you have references to best practice for multi-GPU training in R?
@hetong007 I do not have any best practice for multi-gpu training, but this might help @lichen11 's issue - https://github.com/apache/incubator-mxnet/issues/5296#issuecomment-461608335
Double R-Package/Rpackage still makes cp have trouble finding:
chris@jacie:~/.virtualenvs/mx_cv4/mxnet$ ls lib
libiomp5.so libmkldnn.so.0 libmklml_intel.so libmxnet.a libmxnet.so
chris@jacie:~/.virtualenvs/mx_cv4/mxnet$ workon mx_cv4
(mx_cv4) chris@jacie:~/.virtualenvs/mx_cv4/mxnet$ python
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
>>> mxnet.__version__
'1.5.0'
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
Matrix products: default
BLAS: /usr/local/lib64/R/lib/libRblas.so
LAPACK: /usr/local/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1 yaml_2.2.0
chris@jacie:~/.virtualenvs/mx_cv4/mxnet$ make rpkg
Makefile:313: WARNING: Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages
mkdir -p R-package/inst/libs
cp src/io/image_recordio.h R-package/src
cp -rf lib/libmxnet.so R-package/inst/libs
mkdir -p R-package/inst/include
cp -rf include/ R-package/inst/include
rm R-package/inst/include/dmlc
rm R-package/inst/include/nnvm
cp -rf 3rdparty/dmlc-core/include/ R-package/inst/include/
cp -rf 3rdparty/tvm/nnvm/include/* R-package/inst/include
Rscript -e "if(!require(devtools)){install.packages('devtools', repo = 'https://cloud.r-project.org/')}"
Loading required package: devtools
Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cloud.r-project.org/')); install_deps(pkg='R-package', dependencies = TRUE)"
cp R-package/dummy.NAMESPACE R-package/NAMESPACE
echo "import(Rcpp)" >> R-package/NAMESPACE
R CMD INSTALL R-package
I also tried echo "USE_MKLDNN = 0" >> ./config.mk but got the same results. What did the solution turn out to be?
I have the same issue as @chris-english and @lichen11 (see detailed outputs below), and echo "USE_MKLDNN = 0" >> ./config.mk does not help either.
The issue seems to be the duplicated "R-package/R-package/" in the path where libmxnet.so is expected: the file does exist at the path I get by removing one of the duplicated "R-package/" components.
Does anyone have a stable solution? I need to programmatically download and install R-MXNet on some remote servers. I'm not sure, but one possible fix is that the call
dyn.load("R-package/inst/libs/libmxnet.so", local = FALSE)
should instead be replaced by:
dyn.load("inst/libs/libmxnet.so", local = FALSE)
Also, the fix merged at #13952 didn't seem to resolve this issue.
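One way to see how a doubled directory can arise: a relative path is resolved against the current working directory, so "R-package/inst/libs/libmxnet.so" only points at the real file when the process runs from the source root. The sketch below assumes (this is not confirmed in the thread) that the load happens while the working directory is already inside R-package. resolve_from is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# resolve_from is a hypothetical helper: it prints the absolute path that a
# relative path would resolve to when used from the given working directory.
resolve_from() {
    work_dir="$1"
    rel_path="$2"
    ( cd "$work_dir" && printf '%s/%s\n' "$(pwd)" "$rel_path" )
}

# From the source root the relative path points at the real file:
#   resolve_from /home/username/incubator-mxnet R-package/inst/libs/libmxnet.so
# From inside R-package the same relative path gains a second "R-package":
#   resolve_from /home/username/incubator-mxnet/R-package R-package/inst/libs/libmxnet.so
```

If that assumption holds, the doubled path comes from where the load runs rather than from where the file was copied, which would be consistent with the file being present one level up.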
Detailed Commands & Output:
sudo make rpkg
Makefile:313: WARNING: Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages
mkdir -p R-package/inst/libs
cp src/io/image_recordio.h R-package/src
cp -rf lib/libmxnet.so R-package/inst/libs
if [ -e "lib/libmkldnn.so.0" ]; then \
cp -rf lib/libmkldnn.so.0 R-package/inst/libs; \
cp -rf lib/libiomp5.so R-package/inst/libs; \
cp -rf lib/libmklml_intel.so R-package/inst/libs; \
fi
mkdir -p R-package/inst/include
cp -rl include/* R-package/inst/include
Rscript -e "if(!require(devtools)){install.packages('devtools', repo = 'https://cloud.r-project.org/')}"
Loading required package: devtools
Rscript -e "if(!require(roxygen2)||packageVersion('roxygen2') < '6.1.1'){install.packages('roxygen2', repo = 'https://cloud.r-project.org/')}"
Loading required package: roxygen2
Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cloud.r-project.org/')); install_deps(pkg='R-package', dependencies = TRUE)"
cp R-package/dummy.NAMESPACE R-package/NAMESPACE
echo "import(Rcpp)" >> R-package/NAMESPACE
R CMD INSTALL R-package
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
same with two R-package in path
I am following the steps (https://mxnet.incubator.apache.org/install/ubuntu_setup.html#install-the-mxnet-package-for-r) to update my mxnet R version to 1.3 on CentOS 7.6. I am able to finish steps 1-4, but at step 5, sudo make rpkg, I receive the following errors:
** testing if installed package can be loaded
[1] "Loading local: inst/libs/libmxnet.so"
Error: package or namespace load failed for ‘mxnet’:
.onLoad failed in loadNamespace() for 'mxnet', details:
call: dyn.load("R-package/inst/libs/libmxnet.so", local = FALSE)
error: unable to load shared object '/home/username/incubator-mxnet/R-package/R-package/inst/libs/libmxnet.so':
libmklml_intel.so: cannot open shared object file: No such file or directory
Error: loading failed
Execution halted
ERROR: loading failed
I am running the commands in the incubator-mxnet folder. For some reason, the path contains R-package twice. I can find libmxnet.so in /home/username/incubator-mxnet/R-package/inst/libs. Which file should I edit to remove the extra "R-package" from the path?
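The last line of the error names libmklml_intel.so rather than libmxnet.so, which suggests that a dependency of libmxnet.so could not be found by the dynamic loader. A sketch for checking that, assuming a Linux system where ldd is available. list_missing_deps is a hypothetical helper that only filters ldd output.

```shell
#!/bin/sh
# list_missing_deps is a hypothetical helper: it reads `ldd` output on stdin
# and prints the names of shared libraries reported as "not found".
list_missing_deps() {
    awk '/=> not found/ { print $1 }'
}

# Example, run from the incubator-mxnet folder:
#   ldd R-package/inst/libs/libmxnet.so | list_missing_deps
```

An empty result would mean every dependency resolves in the current environment. Any name that is printed is a library the loader cannot locate from there.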