Closed devonorourke closed 5 years ago
I think this was because I set up my virtual environment with python3, not python2. I'm getting a similar error with python2 now though... so ...?
On Wed, Sep 26, 2018 at 10:02 AM Jon Palmer notifications@github.com wrote:
Should be python-edlib. But you shouldn’t need to specify just install with conda install amptk.
Jon
On Sep 26, 2018, at 6:48 AM, devonorourke notifications@github.com wrote:
Closed #42.
View it on GitHub: https://github.com/nextgenusfs/amptk/issues/42#issuecomment-424726999
-- Devon O'Rourke Graduate student in Molecular and Evolutionary Systems Biology University of New Hampshire
It should run on py2 or py3, but this might be related to the compiler used. Try this:
conda create -n amptk python=3.6 gcc amptk
Then you can activate the env with conda activate amptk. Sometimes the compiler libraries are missing; in this case edlib may have been built with the conda gcc, which may be missing from your current env.
Okay, so following that install:
conda create -n amptk python=3.6 gcc amptk
I get a host of compiler-related clobber issues once the installation is almost done...
SafetyError: The package for amptk located at /mnt/lustre/macmaneslab/devon/.conda/pkgs/amptk-1.2.4-py36r3.4.1_0
appears to be corrupted. The path 'opt/amptk-1.2.4/lib/amptklib.py'
has a sha256 mismatch.
reported sha256: d77dcaf93d7fe79599d734f74c05424591ab9f1be2dc5e023cb0b929346c3875
actual sha256: 831fe2ea3ec4dabbf0740ffa4c97008f3f63d52df41efce2ae44945a415930b5
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libasan.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libatomic.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libatomic.so.1'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libgcc_s.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libgcc_s.so.1'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libgomp.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libgomp.so.1'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libgomp.so.1.0.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libitm.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libitm.so.1'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libitm.so.1.0.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libquadmath.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libquadmath.so.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libquadmath.so.0.0.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libtsan.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libtsan.so.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libtsan.so.0.0.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'share/info/libgomp.info'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'share/info/libquadmath.info'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgfortran-3.0.0-1, conda-forge::libgcc-7.2.0-h69d50b8_2, defaults::gcc-4.8.5-7
path: 'lib/libgfortran.so.3'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgfortran-3.0.0-1, conda-forge::libgcc-7.2.0-h69d50b8_2, defaults::gcc-4.8.5-7
path: 'lib/libgfortran.so.3.0.0'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgfortran-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libgfortran.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libstdcxx-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libstdc++.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libstdcxx-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libstdc++.so.6'
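A long run of ClobberError entries is easier to read when grouped by the packages that collide. Below is a minimal sketch, assuming the log has the three-line shape shown above (a `ClobberError:` line, then `packages:`, then `path:`). The function name `summarize_clobbers` is made up for illustration and is not part of conda. In the output above, every conflict has defaults::gcc-4.8.5-7 on one side and a conda-forge runtime library package on the other.

```python
from collections import defaultdict


def summarize_clobbers(log_text):
    """Group clobbered paths by the tuple of packages that claim them."""
    groups = defaultdict(list)
    packages = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("packages:"):
            # e.g. "packages: conda-forge::libgcc-ng-..., defaults::gcc-4.8.5-7"
            names = line[len("packages:"):].split(",")
            packages = tuple(name.strip() for name in names)
        elif line.startswith("path:") and packages is not None:
            # The path is printed in single quotes; strip them.
            groups[packages].append(line[len("path:"):].strip().strip("'"))
            packages = None
    return dict(groups)


# Two entries copied from the log above, used as sample input.
sample = """\
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libgcc-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libasan.so'
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge::libstdcxx-ng-7.2.0-hdf63c60_3, defaults::gcc-4.8.5-7
path: 'lib/libstdc++.so'
"""

for pkgs, paths in summarize_clobbers(sample).items():
    print(" vs ".join(pkgs), "->", len(paths), "shared path(s)")
```

Pasting the full conda output into `sample` should show at a glance which pairs of packages from different channels are writing to the same files.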
And then if I run amptk --version:
(amptk) [devon@premise ~]$ amptk --version
/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
Traceback (most recent call last):
File "/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/bin/amptk", line 2, in <module>
import amptk
File "/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/opt/amptk-1.2.4/bin/amptk.py", line 15, in <module>
import lib.amptklib as amptklib
File "/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/opt/amptk-1.2.4/lib/amptklib.py", line 14
import edlib-aligner as edlib
I think there was some weird lingering edit that remained in the $HOME/.conda/envs/amptk/opt/amptk-1.2.4/lib/amptklib.py file, where I had substituted:
import edlib-aligner as edlib
I noticed also that edlib wasn't installed with:
conda create -n amptk python=3.6 gcc amptk
So I did:
conda install edlib
And because of that weird numpy error:
conda update numpy
And now I don't have any issue with running amptk --version
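The traceback above stops at the edited import line. A hyphen is not allowed in a Python module name, so that statement cannot be parsed no matter which packages are installed. A small sketch that shows this without importing anything (the helper names are illustrative only):

```python
import keyword


def is_importable_name(name):
    """True if every dotted part of `name` is a valid, non-keyword identifier."""
    return all(
        part.isidentifier() and not keyword.iskeyword(part)
        for part in name.split(".")
    )


def try_compile(statement):
    """Compile (but do not run) a statement and report whether it parses."""
    try:
        compile(statement, "<example>", "exec")
        return "ok"
    except SyntaxError:
        return "SyntaxError"


print(is_importable_name("edlib"))                    # True
print(is_importable_name("edlib-aligner"))            # False
print(try_compile("import edlib as edlib"))           # ok
print(try_compile("import edlib-aligner as edlib"))   # SyntaxError
```

`compile` only parses the statement, so the third line prints "ok" even on a machine where edlib is not installed.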
So does that mean that python-edlib is not installing the edlib backend?
I don't know. I'm just looking at what was installed in the virtual environment bin, and there wasn't any edlib-aligner program initially until I manually installed it.
I don't see any python-edlib at the moment, but could also manually install it.
For what it's worth, amptk illumina appears to be running now, once I've made those modifications (installing edlib and updating numpy).
Right, python-edlib installs the Python bindings for edlib, i.e. it allows you to “import edlib” in the script. conda install edlib will only install the C code backend. When I get some free time I need to cut a new release, as I made some enhancements to amptk stats a while ago and there might be a small bug in taxonomy. After I get a new release out I will experiment with the bioconda packaging.
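To tell the two pieces apart from inside an activated environment, something like the following could be used. It is only a sketch: it assumes the Python bindings are importable as `edlib` and that the command-line program is called `edlib-aligner`, as both are named in this thread. The function name `edlib_status` is made up.

```python
import importlib.util
import shutil


def edlib_status():
    """Report which edlib pieces are visible from the current environment."""
    return {
        # Python bindings: this is what `import edlib` needs.
        "python_module": importlib.util.find_spec("edlib") is not None,
        # Command-line aligner found on PATH.
        "edlib_aligner": shutil.which("edlib-aligner") is not None,
    }


for name, found in edlib_status().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If `python_module` is missing while `edlib_aligner` is found, the environment has the program but not the bindings the script imports.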
Managed to get amptk illumina to work but now getting stuck on the Dada2 clustering step.
Recall that my virtual environment was created with:
conda create -n amptk python=3.6 gcc amptk
The error in the Rscript.log file is a cluster of things: 1) it can't seem to detect a mirror, and 2) it can't figure out how to load the libraries. A little snippet of that error at the beginning:
Warning: failed to download mirrors file (cannot open URL 'https://cran.r-project.org/CRAN_mirrors.csv'); using local file '/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/lib/R/doc/CRAN_mirrors.csv'
Warning message:
In download.file(url, destfile = f, quiet = TRUE) :
URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Couldn't connect to server'
Loading required package: ShortRead
Loading required package: BiocGenerics
Loading required package: methods
Loading required package: parallel
...
...
Loading required package: BiocParallel
Error: package or namespace load failed for ‘BiocParallel’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/mnt/lustre/macmaneslab/devon/R/x86_64-pc-linux-gnu-library/3.4/BiocParallel/libs/BiocParallel.so':
/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/lib/R/bin/exec/../../lib/../.././libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /mnt/lustre/macmaneslab/devon/R/x86_64-pc-linux-gnu-library/3.4/BiocParallel/libs/BiocParallel.so)
Failed with error: ‘package ‘BiocParallel’ could not be loaded’
I figured if the program can't download them, then it can't install them. So I manually went into the virtual environment and tried to do that:
source("https://bioconductor.org/biocLite.R")
biocLite("BiocGenerics")
I also tried things from CRAN:
install.packages("Rcpp", repos='http://cran.us.r-project.org')
The error that I receive in each of these cases is the library loading message:
Error: package or namespace load failed for ‘Rcpp’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/mnt/lustre/macmaneslab/devon/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/libs/Rcpp.so':
/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/lib/R/bin/exec/../../lib/../.././libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /mnt/lustre/macmaneslab/devon/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/libs/Rcpp.so)
Error: package ‘Rcpp’ could not be loaded
I tried:
conda install libgcc
... but that didn't work any better
All I can find in the forums is something along the lines of sudo apt-get install libstdc++6, but I don't have sudo privileges on my compute cluster.
Likewise, I thought maybe updating Conda from the current v4.5.4 to the newest v4.5.11 might help, but I again ran into sudo problems.
Any thoughts? Maybe I should ditch DADA2 and just use UNoise? What's that cryptic line in your docs: 'Likewise, the output data from UNOISE3 is the same as DADA2 and UNOISE2, although “it works better”…'? Which clustering algorithm do you trust?
One other thing: I also tried specifying the Conda-specific .libPath during the install, and it didn't work either:
install.packages("tidyverse", repos='http://cran.us.r-project.org', lib='/mnt/lustre/macmaneslab/devon/.conda/envs/amptk/lib/R/library')
Gives the same error
Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/mnt/lustre/macmaneslab/devon/R/x86_64-pc-linux-gnu-library/3.4/bindrcpp/libs/bindrcpp.so':
/lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /mnt/lustre/macmaneslab/devon/R/x86_64-pc-linux-gnu-library/3.4/bindrcpp/libs/bindrcpp.so)
ERROR: lazy loading failed for package ‘dbplyr’
So it sounds like the R packages aren’t installed correctly in the conda package. There is an R channel for conda, and if you go to Anaconda Cloud you can search for all available packages. You need to find the conda packages for those missing R dependencies. Usually they look like r-package, or for dada2 they are listed as bioconductor-dada2. Note: I’ve had problems with the R packages on conda as well, but haven’t tried in a while. These errors seem to be related to missing compiler libraries, i.e. the package was compiled against a library that is not on your system. Conda theoretically should be managing this, but they switched compilers a few months back, which is perhaps still causing some issues.
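One way to confirm a mismatch like the `GLIBCXX_3.4.21' not found` message is to list the GLIBCXX version tags that a given libstdc++.so.6 actually carries. The sketch below is a crude stand-in for running `strings` on the library and filtering for GLIBCXX: it scans the raw bytes instead of parsing the ELF version tables, so treat the result as a hint. The function names are made up for illustration.

```python
import re

# Matches tags such as GLIBCXX_3.4 or GLIBCXX_3.4.21 inside binary data.
_GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(path):
    """Return the GLIBCXX version tags found in the file, sorted numerically."""
    with open(path, "rb") as handle:
        data = handle.read()
    found = {match.group(1).decode("ascii") for match in _GLIBCXX_TAG.finditer(data)}
    return sorted(found, key=lambda v: tuple(int(part) for part in v.split(".")))


def provides(path, wanted):
    """True if the library at `path` carries the tag GLIBCXX_<wanted>."""
    return wanted in glibcxx_versions(path)
```

For the error above, the library to inspect would be the libstdc++.so.6 named in the message, for example `provides("<env>/lib/libstdc++.so.6", "3.4.21")` with `<env>` replaced by the conda environment path. A `False` result would be consistent with the R package having been built against a newer libstdc++ than the one being loaded.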
Right. That's what I was manually trying to do. The missing R packages are both within CRAN and within Bioconductor. My issue is that I can't install them within R itself. If I'm hearing you right, it's to try to install those packages through anaconda, rather than manually doing it with R? I'll give that a shot
I'm still curious though - could I just use UNoise instead? What's the upside/downside of DADA2 vs. UNoise?
Sure. In practice dada2 and unoise3 are very similar. Dada2 was a little bit more aggressive in chimera filtering the last time I compared, but that will depend on the versions of each software. Unoise3 is much, much faster than dada2. Both ESV pipelines in amptk run an additional 97% clustering step, so the user can use either set of data downstream. Whether you should use ESVs (exact sequence variants) versus clustering (uparse) depends on the amplicon; there are many where I don’t think the ESVs make biological sense.
I think the bulk of these issues were because the compute nodes on my cluster don't have internet access, so when the R script wants to download/upgrade packages, the program dies.
It was only after going back into R and manually downloading ggplot2 that an error message became obvious: the ggplot2 install required another R library, and only after that library was installed could ggplot2 be installed, and only after that could dada2 run.
But at long last, after that manual install, it's working just fine. Weird.
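For anyone hitting the same thing on a cluster, a quick pre-flight check run on the compute node itself can show whether outbound connections are possible before a job tries to download anything. A minimal sketch; the function name is made up, and the host in the usage note is simply the CRAN host from the log above.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

Calling `can_reach("cran.r-project.org")` from a login node and again from inside a submitted job would show whether only the compute nodes are cut off.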
Okay, well, good to know. I added those automatic download scripts as a convenience option to download and install missing packages, but it sounds like I should just delete that so the errors are simpler, i.e. "ggplot2 not installed".
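The "report it, don't try to fix it" idea can be sketched as follows. This is not how amptk does it (its R scripts are written in R). It is a hypothetical Python illustration that relies only on the fact that an installed R package is a directory containing a DESCRIPTION file inside an R library directory. All names below are invented for the example.

```python
import os


def missing_r_packages(required, library_dirs):
    """Return the names in `required` with no installed copy in any library dir."""
    missing = []
    for pkg in required:
        installed = any(
            os.path.isfile(os.path.join(lib, pkg, "DESCRIPTION"))
            for lib in library_dirs
        )
        if not installed:
            missing.append(pkg)
    return missing


def check_or_exit(required, library_dirs):
    """Stop with a one-line message instead of attempting any download."""
    missing = missing_r_packages(required, library_dirs)
    if missing:
        raise SystemExit("R package(s) not installed: " + ", ".join(missing))
```

With a check like this, a node without internet access fails immediately with the name of the missing package rather than with a download or library-loading error further down the log.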
The "auto-install" in the R scripts was removed. Re-open if this is still an issue.