bioFAM / MOFA

Multi-Omics Factor Analysis
GNU Lesser General Public License v3.0

Failure to install with custom python #38

Closed s-andrews closed 5 years ago

s-andrews commented 5 years ago

I'm trying to install MOFA into an existing R/3.5.1 install, using a custom compiled python 3.7. I can get the basic reticulate and MOFAData packages installed but the actual main MOFA install fails. I'll put some details (and a guess as to what I think is breaking) below.

Basic Setup

$ module load R
$ module load python3
$ which python3
/bi/apps/python/3.7.3/bin/python3

Show that python works and that mofapy is there

$ python3
Python 3.7.3 (default, May 24 2019, 14:03:09)
[GCC 5.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mofapy
>>> print(mofapy.__path__)
['/bi/apps/python/3.7.3/lib/python3.7/site-packages/mofapy']

Show that R and reticulate are working

$ R

R version 3.5.1 (2018-07-02) -- "Feather Spray"
[etc]

> library(reticulate)
> use_python("/bi/apps/python/3.7.3/bin/python3")
> py_config()
python:         /bi/apps/python/3.7.3/bin/python3
libpython:      /bi/apps/python/3.7.3/lib/libpython3.7m.so
pythonhome:     /bi/apps/python/3.7.3:/bi/apps/python/3.7.3
version:        3.7.3 (default, May 24 2019, 14:03:09)  [GCC 5.2.0]
numpy:          /bi/apps/python/3.7.3/lib/python3.7/site-packages/numpy
numpy_version:  1.16.3

python versions found:
 /bi/apps/python/3.7.3/bin/python3
 /usr/bin/python
 /opt/python/bin/python
 /opt/python/bin/python3
 /bi/home/andrewss/miniconda3/envs/graphprot/bin/python
 /bi/home/andrewss/miniconda3/bin/python

Show that MOFAdata is there

> library(MOFAdata)

(you don't see anything, but no news is good news)

Try to install MOFA

> devtools::install_github("bioFAM/MOFA", build_opts = c("--no-resave-data"))
Downloading GitHub repo bioFAM/MOFA@master
v  checking for file '/tmp/RtmpXT3Yne/remotes7ae2442a099a/bioFAM-MOFA-5785253/DESCRIPTION' (426ms)
-  preparing 'MOFA':
v  checking DESCRIPTION meta-information ...
-  installing the package to build vignettes
E  creating vignettes (1m 29.7s)
   Warning in engine$weave(file, quiet = quiet, encoding = enc) :
     Pandoc (>= 1.12.3) and/or pandoc-citeproc not available. Falling back to R Markdown v1.
   Warning in engine$weave(file, quiet = quiet, encoding = enc) :
     Pandoc (>= 1.12.3) and/or pandoc-citeproc not available. Falling back to R Markdown v1.
   Loading required package: SummarizedExperiment
   Loading required package: GenomicRanges
   Loading required package: stats4
   Loading required package: BiocGenerics
   Loading required package: parallel

   Attaching package: 'BiocGenerics'

   The following objects are masked from 'package:parallel':

       clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
       clusterExport, clusterMap, parApply, parCapply, parLapply,
       parLapplyLB, parRapply, parSapply, parSapplyLB

   The following objects are masked from 'package:stats':

       IQR, mad, sd, var, xtabs

   The following objects are masked from 'package:base':

       Filter, Find, Map, Position, Reduce, anyDuplicated, append,
       as.data.frame, basename, cbind, colMeans, colSums, colnames,
       dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
       intersect, is.unsorted, lapply, lengths, mapply, match, mget,
       order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
       rowMeans, rowSums, rownames, sapply, setdiff, sort, table,
       tapply, union, unique, unsplit, which, which.max, which.min

   Loading required package: S4Vectors

   Attaching package: 'S4Vectors'

   The following object is masked from 'package:base':

       expand.grid

   Loading required package: IRanges
   Loading required package: GenomeInfoDb
   Loading required package: Biobase
   Welcome to Bioconductor

       Vignettes contain introductory material; view with
       'browseVignettes()'. To cite Bioconductor, see
       'citation("Biobase")', and for packages 'citation("pkgname")'.

   Loading required package: DelayedArray
   Loading required package: matrixStats

   Attaching package: 'matrixStats'

   The following objects are masked from 'package:Biobase':

       anyMissing, rowMedians

   Loading required package: BiocParallel

   Attaching package: 'DelayedArray'

   The following objects are masked from 'package:matrixStats':

       colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

   The following objects are masked from 'package:base':

       aperm, apply

   Attaching package: 'MOFA'

   The following objects are masked from 'package:Biobase':

       featureNames, featureNames<-, sampleNames, sampleNames<-

   The following object is masked from 'package:stats':

       predict

   Creating MOFA object from a MultiAssayExperiment object...
   Warning in engine$weave(file, quiet = quiet, encoding = enc) :
     Pandoc (>= 1.12.3) and/or pandoc-citeproc not available. Falling back to R Markdown v1.
   Warning in engine$weave(file, quiet = quiet, encoding = enc) :
     Pandoc (>= 1.12.3) and/or pandoc-citeproc not available. Falling back to R Markdown v1.
   'import site' failed; use -v for traceback
   Traceback (most recent call last):
     File "/bi/apps/R/3.5.1/lib64/R/library/reticulate/config/config.py", line 3, in <module>
       import os
   ImportError: No module named os
   Quitting from lines 42-56 (MOFA_example_simulated.Rmd)
   Error: processing vignette 'MOFA_example_simulated.Rmd' failed with diagnostics:
   mofapy package not found. Make sure that Reticulate is pointing to the right Python binary.
            Please read the instructions here: https://github.com/bioFAM/MOFA#installation
   Execution halted
Error in (function (command = NULL, args = character(), error_on_status = TRUE,  :
  System command error

Test the os import in our python

$ python3
Python 3.7.3 (default, May 24 2019, 14:03:09)
[GCC 5.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>>

My guess is that the install script is calling python rather than python3, which is the binary configured in reticulate. If I run python on the command line I get the same error as the install script.

$ python
'import site' failed; use -v for traceback
Python 2.6.6 (r266:84292, Dec  7 2011, 20:48:22)
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named os
>>>

It seems pretty likely that the name of the python executable is what's causing the failure, but I don't know where to look to change it. If you have any suggestions for where this would need to be modified or configured, that would be great.
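
For anyone wanting to reproduce the check above in one go, here is a minimal sketch that reports what each bare command name resolves to on PATH. It uses only the Python standard library; the function name resolve_interpreters is made up for illustration and is not part of MOFA or reticulate.

```python
import shutil
import subprocess


def resolve_interpreters(names=("python", "python3")):
    """Map each command name to (path, version text), or None if not on PATH."""
    found = {}
    for name in names:
        path = shutil.which(name)
        if path is None:
            found[name] = None
            continue
        # Some older interpreters (Python 2, for example) print the version
        # to stderr, so stderr is merged into stdout here.
        proc = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        found[name] = (path, proc.stdout.strip())
    return found


if __name__ == "__main__":
    for name, info in resolve_interpreters().items():
        print(name, "->", info)
```

If python and python3 come back with different paths, anything that launches the bare name python will not get the interpreter that was configured by hand.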

rargelaguet commented 5 years ago

Hi Simon, thanks for the detailed explanation. I suspect that you are right and that the script is calling python instead of python3. But if you ran use_python("/bi/apps/python/3.7.3/bin/python3") before devtools::install_github("bioFAM/MOFA", build_opts = c("--no-resave-data")), then I don't understand why R is not calling the python binary that you specified. Not sure if this helps, but I have sometimes had to restart R after configuring reticulate for everything to work.

An alternative solution is to install everything from Bioconductor (https://bioconductor.org/packages/release/bioc/html/MOFA.html), but this requires R 3.6.

Hope this is useful, let me know if you can solve it.

s-andrews commented 5 years ago

I've got it working, but with a kludge!

I worked around the path issue by doing ln -s python3 python in my python3 install so that I get the right thing even if I call the wrong binary name. This allowed MOFA to install.
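
A variant of the same idea, for anyone who would rather not touch the python3 install itself: put the python symlink in a separate directory and place that directory first on PATH. This is only a sketch of that variant, not what I actually ran; make_python_shim and path_with_shim are hypothetical helper names.

```python
import os
import sys
import tempfile


def make_python_shim(target, shim_dir, name="python"):
    """Create shim_dir/name as a symlink to target and return the link path."""
    os.makedirs(shim_dir, exist_ok=True)
    link = os.path.join(shim_dir, name)
    # Replace a stale link left over from an earlier run, if there is one.
    if os.path.islink(link):
        os.remove(link)
    os.symlink(target, link)
    return link


def path_with_shim(shim_dir, current_path):
    """Return a PATH string with shim_dir placed first."""
    if not current_path:
        return shim_dir
    return os.pathsep.join([shim_dir, current_path])


if __name__ == "__main__":
    # Illustration only: links "python" to the interpreter running this script.
    shim_dir = tempfile.mkdtemp(prefix="pyshim-")
    make_python_shim(sys.executable, shim_dir)
    print("export PATH=" + path_with_shim(shim_dir, os.environ.get("PATH", "")))
```

The printed export line has to be applied in the shell before R is started, so that the install and any processes it launches see the shim first.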

I did some more chasing and it doesn't look like the problem is at the level of reticulate. If you don't add the symlink then simple reticulate commands such as import("time") still work OK, so it's calling the correct python binary. It must be something further along the chain.
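
One possibility, which nobody in this thread has confirmed, is that the vignettes are built in a separate R process that never sees the use_python() call made in the interactive session. reticulate documents a RETICULATE_PYTHON environment variable for selecting the interpreter, and environment variables are inherited by child processes. A minimal sketch of preparing such an environment from Python follows; the helper name env_with_reticulate_python is hypothetical, and I have not tested whether this fixes the MOFA install.

```python
import os


def env_with_reticulate_python(python_path, base_env=None):
    """Return a copy of the environment with RETICULATE_PYTHON set."""
    env = dict(os.environ if base_env is None else base_env)
    env["RETICULATE_PYTHON"] = python_path
    return env


# Example (not executed here): pass the result as env= when launching R, e.g.
#   subprocess.run(["Rscript", "-e", "reticulate::py_config()"], env=env)
env = env_with_reticulate_python("/bi/apps/python/3.7.3/bin/python3")
print(env["RETICULATE_PYTHON"])
```

The same effect can be had by exporting the variable in the shell before starting R, which avoids the wrapper entirely.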

Anyway, it's now working here and this might help others who hit the same issue, but it might still be worth tracking down where the rogue python call is coming from.

Thanks for the help.