swarris / Pacasus

Correction of palindromes in long reads from PacBio and Nanopore
MIT License

Unable to get Pacasus to run #3

Closed mahesh-panchal closed 7 years ago

mahesh-panchal commented 7 years ago

Hello, I wonder if you can help. I've spent a while struggling to get Pacasus installed. In the end, these are the instructions I used. I installed OpenCL using the AMD installer:

./AMD-APP-SDK-v3.0.130.136-GA-linux64.sh

And then installed Pacasus with:

export PATH="$TOOLS/.localpython/anaconda3/bin/:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$TOOLS/.localpython/anaconda3/lib/"
python3.5 -m virtualenv Pacasus_env2.7 -p $TOOLS/.localpython/bin/python2.7 --system-site-packages
cd Pacasus_env2.7
. bin/activate
pip install BioPython
pip install mako
pip install cffi
pip install pyOpenCL --global-option=build_ext --global-option="-I$TOOLS/OpenCL/AMDAPPSDK-3.0/include"
git clone https://github.com/swarris/Pacasus.git
cd Pacasus
git submodule init
git submodule update

When I try to run Pacasus, I get this error.

python pacasus.py $PHOME/B2017008_Novel_Fungus/03_Assemblies/Pacasus_Test/nanopore.fasta -o $PHOME/B2017008_Novel_Fungus/03_Assemblies/Pacasus_Test/nanopore.pacasus_filtered.fasta --framework=opencl --device_type=CPU --platform_name=Intel
INFO - Initializing application...
DEBUG - Initializing Score...
DEBUG - Initializing score finished.
DEBUG - Initializing DnaRnaScore...
DEBUG - Creating matrix with parameters:
        match_score: 3,
        mismatch_score: -4,
        gap_score: -3.0,
        other_score: -1,
        any_score: 0
DEBUG - Initializing DnaRnaScore finished.
INFO - Application initialized.
INFO - Setting program...
DEBUG - Initializing aligner...
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
DEBUG - Setting SW...
DEBUG - Using OpenCL CPU implementation
Traceback (most recent call last):
  File "pacasus.py", line 12, in <module>
    ppw.run()
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pacasus/pacasusall.py", line 106, in run
    self._set_program()
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pacasus/pacasusall.py", line 82, in _set_program
    self.program = Palindrome(self.logger, self.score, self.settings)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/Programs.py", line 357, in __init__
    Aligner.__init__(self, logger, score, settings)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/Programs.py", line 52, in __init__
    from pyPaSWAS.Core.SmithWatermanOcl import SmithWatermanCPU
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/SmithWatermanOcl.py", line 1, in <module>
    import pyopencl as cl
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/lib/python2.7/site-packages/pyopencl/__init__.py", line 37, in <module>
    import pyopencl.cffi_cl as _cl
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/lib/python2.7/site-packages/pyopencl/cffi_cl.py", line 39, in <module>
    from pyopencl._cffi import ffi as _ffi
ImportError: /afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/lib/python2.7/site-packages/pyopencl/_cffi.so: undefined symbol: clSVMFree

Could you help me resolve the error please?

It would be great if this tool could be bundled into a BioConda package or Docker container for easier installation.

Regards, Mahesh.

swarris commented 7 years ago

Hello Mahesh,

Sorry to learn you have had such difficulty getting Pacasus up and running. The problem is not with Pacasus itself, but with the OpenCL + PyOpenCL installation. I checked with others more involved in this. It looks like you have installed OpenCL headers that do not match the OpenCL ICD loader (libOpenCL.so). PyOpenCL thinks it is building against an ICD loader that supports OpenCL 2 (the SVM functions are part of OpenCL 2), but libOpenCL.so apparently doesn't provide those functions. You could try upgrading your OpenCL library to one that supports OpenCL 2.0. If that is not an option, you can force PyOpenCL to work only with OpenCL 1.2. To do that, download the PyOpenCL source, create (or edit) a file called 'siteconf.py' containing (at least) the line:

CL_PRETEND_VERSION = "1.2"

and rebuild:

rm -Rf build
pip install .
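
One way to check for the header/loader mismatch described above is to ask the loader library directly whether it exports an OpenCL 2.0 symbol such as clSVMFree. The following is a minimal sketch using only Python's standard ctypes module. The helper name library_exports is hypothetical, and the library name it is given is an assumption that depends on where libOpenCL.so lives on your system.

```python
import ctypes
import ctypes.util


def library_exports(lib_path, symbol):
    """Return True/False if the library exports `symbol`, or None if it cannot be loaded."""
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return None  # library not found or not loadable
    # ctypes resolves the symbol on attribute access and raises AttributeError
    # when it is missing, which hasattr turns into False.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Assumption: the ICD loader is discoverable under the name "OpenCL".
    loader = ctypes.util.find_library("OpenCL") or "libOpenCL.so"
    print(loader, "exports clSVMFree:", library_exports(loader, "clSVMFree"))
```

A result of False for the library that is picked up at runtime, combined with headers from an OpenCL 2.0 SDK, would be consistent with the "undefined symbol: clSVMFree" error.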

Hope this helps.

I agree that a Docker container (or Bioconda package) would be preferred. OpenCL / CUDA are somewhat complicated to configure in an automated way, and I have not yet had the time to set it up.

mahesh-panchal commented 7 years ago

Thank you for the quick reply.

I uninstalled and reinstalled pyOpenCL and now I have a new error which I don't understand.

pip uninstall pyOpenCL
git clone --recursive http://git.tiker.net/trees/pyopencl.git
cd pyopencl
echo "CL_PRETEND_VERSION = \"1.2\"" >> siteconf.py
pip install . --global-option=build_ext --global-option="-I$TOOLS/OpenCL/AMDAPPSDK-3.0/include"

When I run Pacasus, this is the error I get:

python pacasus.py $PHOME/B2017008_Novel_Fungus/03_Assemblies/Pacasus_Test/nanopore.fasta -o $PHOME/B2017008_Novel_Fungus/03_Assemblies/Pacasus_Test/nanopore.pacasus_filtered.fasta --framework=opencl --device_type=CPU --platform_name=NVIDIA
INFO - Initializing application...
DEBUG - Initializing Score...
DEBUG - Initializing score finished.
DEBUG - Initializing DnaRnaScore...
DEBUG - Creating matrix with parameters:
        match_score: 3,
        mismatch_score: -4,
        gap_score: -3.0,
        other_score: -1,
        any_score: 0
DEBUG - Initializing DnaRnaScore finished.
INFO - Application initialized.
INFO - Setting program...
DEBUG - Initializing aligner...
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
DEBUG - Setting SW...
DEBUG - Using OpenCL CPU implementation
DEBUG - Initializing SmithWaterman.
Traceback (most recent call last):
  File "pacasus.py", line 12, in <module>
    ppw.run()
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pacasus/pacasusall.py", line 106, in run
    self._set_program()
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pacasus/pacasusall.py", line 82, in _set_program
    self.program = Palindrome(self.logger, self.score, self.settings)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/Programs.py", line 357, in __init__
    Aligner.__init__(self, logger, score, settings)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/Programs.py", line 53, in __init__
    self.smith_waterman = SmithWatermanCPU(self.logger, self.score, settings)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/SmithWatermanOcl.py", line 320, in __init__
    SmithWatermanOcl.__init__(self, logger, score, settings)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/SmithWatermanOcl.py", line 38, in __init__
    self._set_platform(self.settings.platform_name)
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/Pacasus/pypaswas/pyPaSWAS/Core/SmithWatermanOcl.py", line 89, in _set_platform
    for platform in cl.get_platforms():
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/lib/python2.7/site-packages/pyopencl/cffi_cl.py", line 676, in get_platforms
    _handle_error(_lib.get_platforms(platforms.ptr, platforms.size))
  File "/afs/pdc.kth.se/roots/srv/tegner/v1.8/cfs/klemming/nobackup/m/mahpa906/Tools/Pacasus_env2.7/lib/python2.7/site-packages/pyopencl/cffi_cl.py", line 649, in _handle_error
    raise e
pyopencl.cffi_cl.LogicError: clGetPlatformIDs failed: <unknown error -1001>

Googling hasn't really helped me here. I checked what was in:

$ ls -a /etc/OpenCL/vendors/         
.  ..  nvidia.icd

The contents of that file are this:

$ cat /etc/OpenCL/vendors/nvidia.icd 
libnvidia-opencl.so.1

Any light you could shine on this would be appreciated. Regards, Mahesh.
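
For reference, error code -1001 is CL_PLATFORM_NOT_FOUND_KHR, which the ICD loader returns when it finds no usable platform. A quick way to see what the loader has to work with is to list the .icd files in the vendor directory and check whether each library they name can be loaded. This is a minimal sketch with hypothetical helper names (read_icd_entries, can_load). It assumes the default /etc/OpenCL/vendors location, and not every ICD loader honours an OPENCL_VENDOR_PATH override.

```python
import ctypes
import glob
import os


def read_icd_entries(vendor_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name in vendor_dir to the library name or path it contains."""
    entries = {}
    for icd_path in sorted(glob.glob(os.path.join(vendor_dir, "*.icd"))):
        with open(icd_path) as handle:
            entries[os.path.basename(icd_path)] = handle.read().strip()
    return entries


def can_load(library):
    """Return True if the dynamic linker can load `library`."""
    try:
        ctypes.CDLL(library)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: fall back to the system directory when no override is set.
    vendor_dir = os.environ.get("OPENCL_VENDOR_PATH", "/etc/OpenCL/vendors")
    for icd_file, library in read_icd_entries(vendor_dir).items():
        print(icd_file, "->", library, "loadable:", can_load(library))
```

With the directory listing shown above, only nvidia.icd would be reported, so no CPU runtime is registered in that directory.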

swarris commented 7 years ago

You're mixing two systems. The OpenCL vendor file indicates you have an NVIDIA GPU. In that case you should use: --framework=cuda --device_type=GPU --platform_name=NVIDIA and install the NVIDIA CUDA SDK: https://developer.nvidia.com/cuda-downloads

But you installed the AMD SDK. If you'd like to run Pacasus on a CPU, you also need to install the AMD OpenCL driver: http://support.amd.com/en-us/kb-articles/Pages/OpenCL2-Driver.aspx

Could you post your system configuration? Make/model CPU and GPU, etc?

I see you're using Nanopore data. These reads can be very long. For very long reads (>25 kb), the OpenCL CPU implementation is the only option because of the memory requirements of the Smith-Waterman alignment involved.
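
The rule of thumb above can be turned into a small pre-flight check. The following is a rough sketch, not part of Pacasus: the helper names are hypothetical, the hard cut-off at exactly 25,000 bases is an assumption based on the ">25 kb" remark, and the CPU platform name "Intel" is only a guess that must match whatever OpenCL runtime is installed. The flags themselves are the ones used elsewhere in this thread.

```python
def longest_read(fasta_path):
    """Return the length of the longest sequence in a FASTA file."""
    longest = current = 0
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A new record starts: close off the previous one.
                longest = max(longest, current)
                current = 0
            else:
                current += len(line.strip())
    return max(longest, current)


def pacasus_device_flags(fasta_path, threshold=25000, cpu_platform="Intel"):
    """Suggest Pacasus device flags based on the longest read (assumed cut-off)."""
    if longest_read(fasta_path) > threshold:
        # Very long reads: the OpenCL CPU implementation, per the comment above.
        return ["--framework=opencl", "--device_type=CPU",
                "--platform_name=" + cpu_platform]
    # Shorter reads: the CUDA settings suggested for an NVIDIA GPU.
    return ["--framework=cuda", "--device_type=GPU", "--platform_name=NVIDIA"]
```

The GPU branch only makes sense on a machine that actually has an NVIDIA GPU and the CUDA SDK installed.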

mahesh-panchal commented 7 years ago

Sorry, I was also trying things out. I also ran with the Intel parameter with the same results.

I thought I understood from the AMD SDK that it would also install the Intel drivers, but I have no idea if it did. There was no error with the installation of that.

The head node of the cluster I'm testing on has this config:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz
Stepping:              2
CPU MHz:               3300.000
BogoMIPS:              6005.69
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              10240K
NUMA node0 CPU(s):     0-3,8-11
NUMA node1 CPU(s):     4-7,12-15

and the node I hope to run it on in future has this config:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                168
On-line CPU(s) list:   0-167
Thread(s) per core:    2
Core(s) per socket:    14
Socket(s):             6
NUMA node(s):          6
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E7-4850 v3 @ 2.20GHz
Stepping:              4
CPU MHz:               1200.203
BogoMIPS:              4406.63
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              35840K
NUMA node0 CPU(s):     0-13,84-97
NUMA node1 CPU(s):     14-27,98-111
NUMA node2 CPU(s):     28-41,112-125
NUMA node3 CPU(s):     42-55,126-139
NUMA node4 CPU(s):     56-69,140-153
NUMA node5 CPU(s):     70-83,154-167

I do not have root permissions on these.

Does the error mean I am missing the Intel drivers? What file exactly should be there? Is it possible for me to install these by myself or do I need root privileges?

swarris commented 7 years ago

You can also use the AMD loader for Intel CPUs. The ICD file normally needs to be in /etc, but this page shows how to get the ICD installed without root access: https://wiki.tiker.net/OpenCLHowTo#Per-user_ICD_registry.3F

mahesh-panchal commented 7 years ago

Thank you. I think it's working now. The test command is doing something.

I'm not sure which part got it working since I was a little confused about the instructions. This is what I did though.

wget http://registrationcenter-download.intel.com/akdlm/irc_nas/9019/opencl_runtime_16.1.1_x64_ubuntu_6.4.0.25.tgz
tar -xvf opencl_runtime_16.1.1_x64_ubuntu_6.4.0.25.tgz
cd opencl_runtime_16.1.1_x64_ubuntu_6.4.0.25
TGT_DIR=$PWD/intel-opencl-icd-16.1.1/lib
mkdir -p $TGT_DIR
rpm2cpio rpm/opencl-1.2-intel-cpu-6.4.0.25-1.x86_64.rpm | cpio -idmv
cp ./opt/intel/opencl-1.2-6.4.0.25/lib64/* $TGT_DIR
echo $TGT_DIR/libintelocl.so > intel.icd
export OPENCL_VENDOR_PATH=$TOOLS/OpenCL/opencl_runtime_16.1.1_x64_ubuntu_6.4.0.25

Since I wasn't sure about linking libOpenCL.so, I did this:

LD_LIBRARY_PATH="$TOOLS/OpenCL/AMDAPPSDK-3.0/lib/x86_64/sdk:$LD_LIBRARY_PATH"
pip uninstall pyOpenCL
pip install . --global-option=build_ext --global-option="-I$TOOLS/OpenCL/AMDAPPSDK-3.0/include"
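
To see whether a setup like this actually exposes a platform, independent of Pacasus, PyOpenCL can be asked to enumerate what it finds. This is a minimal sketch: the helper name list_opencl_platforms is hypothetical, while get_platforms, get_devices and the name attributes are regular PyOpenCL calls. It returns None instead of raising when PyOpenCL cannot be imported or no platform is found.

```python
def list_opencl_platforms():
    """Return [(platform name, [device names])], or None if PyOpenCL cannot be used."""
    try:
        import pyopencl as cl
    except ImportError:
        # PyOpenCL is missing, or its extension module fails to load
        # (for example because of an undefined symbol).
        return None
    try:
        return [(platform.name, [device.name for device in platform.get_devices()])
                for platform in cl.get_platforms()]
    except cl.Error:
        # For example: clGetPlatformIDs failed with -1001.
        return None


if __name__ == "__main__":
    print(list_opencl_platforms())
```

If the Intel runtime is registered correctly, the output should include a platform whose name matches the value passed to --platform_name.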

Thank you for your help. Regards, Mahesh.

swarris commented 7 years ago

No problem! Glad to hear you have it up and running. Installing OpenCL is a pain on most systems :-(

Let me know if you need help with Pacasus itself.