cbg-ethz / shorah

Repo for the software suite ShoRAH (Short Reads Assembly into Haplotypes)
GNU General Public License v3.0

No module named 'shorah_snv' #60

Closed yeemey closed 5 years ago

yeemey commented 5 years ago

Hi, I just installed shorah-1.1.3 on my local environment on a Linux server, and can't even run the example case. Here's the command, and resulting error:

./amplian.py -b ../examples/amplicon_test/ampli_sorted.bam -f ../examples/amplicon_test/reference.fasta

Traceback (most recent call last):
  File "./amplian.py", line 429, in <module>
    args.s, args.region, args.diversity)
  File "./amplian.py", line 303, in main
    import shorah_snv
ModuleNotFoundError: No module named 'shorah_snv'

I also tried with shorah 1.1.2 and got the same error. I'm pretty sure I overlooked something basic, but I can't find much in the docs to help me troubleshoot. I'm running Python3.6, on Ubuntu 16.04.5, if that helps. Thanks!

DrYak commented 5 years ago

Hello,

First, there is a shortcut that might help you a lot regarding installation :

What sequence of commands did you use to install shorah? In which directory did you install it? I suspect you are having problems either because you installed it in your home directory or because you are trying to run it from its source directory.

The most likely cause of these error messages is that the Python module is installed in a directory that is not in Python's search path.

For example, if you installed in /home/username/, the installer puts the modules inside /home/username/lib/python3.6/site-packages/, which Python does not search automatically.

You can set the environment variable PYTHONPATH=$HOME/lib/python3.6/site-packages/ to ask Python to add that directory to its search path.
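
To check whether the search path really is the problem, a small script like the one below can help. It is only a sketch: the helper name module_is_visible is made up for illustration, and the site-packages directory assumes shorah was installed with $HOME as prefix, as in the example above.

```python
import importlib
import importlib.util
import os
import sys


def module_is_visible(name, extra_dirs=()):
    """Return True if `name` can be found on sys.path plus `extra_dirs`."""
    saved = list(sys.path)
    try:
        # Prepending directories mirrors what PYTHONPATH does at start-up.
        sys.path[:0] = [os.path.expanduser(d) for d in extra_dirs]
        importlib.invalidate_caches()
        return importlib.util.find_spec(name) is not None
    finally:
        # Leave the interpreter's search path exactly as we found it.
        sys.path[:] = saved


if __name__ == "__main__":
    # Assumed location: shorah installed with $HOME as prefix.
    site_dir = "~/lib/python3.6/site-packages"
    print("default search path:", module_is_visible("shorah_snv"))
    print("with extra directory:", module_is_visible("shorah_snv", [site_dir]))
```

If the first line prints False and the second prints True, setting PYTHONPATH as described above should be enough to fix the import.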

We are in the process of changing the build system for Shorah 2.0 and there might be ways to simplify this in the future.

yeemey commented 5 years ago

I did install in my home directory, and so changed the PYTHONPATH environment variable as you suggested. Now I have a new error when I reran the example! (Both shorah and gsl are in /home/user/lib/)

./diri_sampler: error while loading shared libraries: libgsl.so.23: cannot open shared object file: No such file or directory
--- Logging error ---
Traceback (most recent call last):
  File "/home/user/lib/anaconda3/pkgs/python-3.6.6-h5001a0f_0/lib/python3.6/logging/handlers.py", line 71, in emit
    if self.shouldRollover(record):
  File "/home/user/lib/anaconda3/pkgs/python-3.6.6-h5001a0f_0/lib/python3.6/logging/handlers.py", line 187, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/home/user/lib/anaconda3/pkgs/python-3.6.6-h5001a0f_0/lib/python3.6/logging/__init__.py", line 839, in format
    return fmt.format(record)
  File "/home/user/lib/anaconda3/pkgs/python-3.6.6-h5001a0f_0/lib/python3.6/logging/__init__.py", line 576, in format
    record.message = record.getMessage()
  File "/home/user/lib/anaconda3/pkgs/python-3.6.6-h5001a0f_0/lib/python3.6/logging/__init__.py", line 338, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "./amplian.py", line 429, in <module>
    args.s, args.region, args.diversity)
  File "./amplian.py", line 350, in main
    ret_diri = run_child(diri_exe, diri_args)
  File "./amplian.py", line 63, in run_child
    amplog.error("Child %s terminated by signal" % exe_name, retcode)
Message: 'Child ./diri_sampler terminated by signal'
Arguments: (127,)
Traceback (most recent call last):
  File "./amplian.py", line 429, in <module>
    args.s, args.region, args.diversity)
  File "./amplian.py", line 354, in main
    run_diagnostics(win_file, n_reads)
  File "./amplian.py", line 80, in run_diagnostics
    with open(dbg_file) as l:
FileNotFoundError: [Errno 2] No such file or directory: 'w-reference-1-73.dbg'

I tried conda install -c bioconda shorah as well, but all I got in the ../env/share/shorah folder are the amplicon_test directory, ref_genome.fasta and sample_454.fasta. Where else should I be looking for the binaries?

Thank you for your prompt response!

DrYak commented 5 years ago

Hello,

Where else should I be looking for the binaries?

When installing packages, conda puts their binaries, libraries, etc. in the bin, lib, etc. subdirectories of that environment. In your case, that would be ../env/bin, ../env/lib/python3.6/site-packages, and so on.

But if you run source ../env/bin/activate, this activates the conda environment ../env, i.e. conda makes sure that the necessary environment variables (PATH, etc.) are updated so that packages installed in that environment are directly accessible from the command line.

(If you're used to working in HPC environments, the concept is similar to the environment modules typically found there.)
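
To see where the binaries ended up without activating anything, something along these lines could be used. Treat it as a sketch: find_tools is a made-up helper name, and it assumes the conda package installs executables with the same names (diri_sampler, fil) as the source build.

```python
import os
import shutil


def find_tools(names, env_prefix=None):
    """Map each tool name to its full path, or None if it is not found.

    With env_prefix=None the current PATH is searched, which is what an
    activated environment gives you; otherwise only <env_prefix>/bin is used.
    """
    search_path = None if env_prefix is None else os.path.join(env_prefix, "bin")
    return {name: shutil.which(name, path=search_path) for name in names}


if __name__ == "__main__":
    # "../env" is the environment path used in this thread.
    for tool, location in find_tools(["diri_sampler", "fil"], "../env").items():
        print(tool, "->", location or "not found")
```

Calling find_tools(["diri_sampler", "fil"]) without a prefix after activating the environment shows whether the shell would find them on PATH.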


Which also brings me to another point:

./diri_sampler: error while loading shared libraries: libgsl.so.23: cannot open shared object file: No such file or directory

As far as I know, no Linux distribution looks for libraries in ~/lib by default.

Four possible solutions:

  1. Ugly hack, if you're awfully lazy: put a symlink to the library in the same directory as the executable (so here: ln -s ../lib/libgsl.so.23 ~/bin); it should be picked up there on most Linux distributions.
  2. Environment variable fix: LD_LIBRARY_PATH=$HOME/lib can be used to ask the dynamic library loader to also look for .so libraries in your home directory, in addition to the usual /usr/lib/ etc.

    Please note: for performance reasons this is discouraged in some cluster environments. Ask your local sysadmins, and if that is the case use solution 1, 3 or 4 instead.

  3. Delete the diri_sampler and fil executables in ~/bin (i.e. the components that require libgsl.so), then run export LD_RUN_PATH=$HOME/lib before re-compiling / re-installing them. This embeds a hint (an rpath) in the executables telling the loader to also look in this exact path.
  4. If you don't want to recompile, the rpath can be added manually using patchelf --set-rpath $HOME/lib {executable_name}
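
Whichever solution you pick, you can verify the result by asking ldd which libraries are still unresolved. The snippet below is a rough sketch that assumes ldd is available on the system; the function names missing_libraries and check_executable are made up for this example.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libgsl.so.23 => not found"
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check_executable(path):
    """Run ldd on `path` and return its unresolved libraries."""
    result = subprocess.run(["ldd", path], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return missing_libraries(result.stdout)


if __name__ == "__main__":
    sample = "\tlibgsl.so.23 => not found\n\tlibm.so.6 => /lib/libm.so.6 (0x0)\n"
    print(missing_libraries(sample))
```

Running check_executable on diri_sampler before and after applying a fix should show libgsl.so.23 disappearing from the returned list.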

yeemey commented 5 years ago

Hi,

I reinstalled gsl through conda in my environment so that all the paths will be taken care of, and this time everything worked, albeit with the expected(?) warnings:

UserWarning: 60.1 % of untouched objects <should be around 90-95%>
  warnings.warn(unt_msg)
UserWarning: posterior = 1.280 > 1
  warnings.warn('posterior = %4.3f > 1' % post)

Thanks for your thorough answers! If you're open to contributors, I'm happy to make a pull request adding the recommendation to install shorah via bioconda to the README.

DrYak commented 5 years ago

Happy that your installation problems were solved.

We will indeed update the documentation: I'll see to it with the original authors of shorah.

If you have a bit of time, I would be interested if you could drop me an e-mail at ivan.topolsk@bsse.ethz.ch and tell me how you plan to use Shorah and how you came to pick our software.