novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

Issue Running Epinano_Variants.py #55

Closed eliah-o closed 4 years ago

eliah-o commented 4 years ago

Hello,

I'm trying to get the feature extraction script Epinano_Variants.py to run on a BAM file. I used the following command:

python ../software/EpiNano-Epinano1.2.0/Epinano_Variants.py \
-R ../../flair_pipeline/refs/hg19.fa \
-T g \
-s ../software/EpiNano-Epinano1.2.0/misc/sam2tsv.jar \
-b ../../flair_pipeline/stepONE_flair_align_output/Elf1_2i.bam \
-t 6

When I run this, the program prints the following error message and then hangs without terminating.

Process Process-2:
Traceback (most recent call last):
  File "/net/gs/vol3/software/modules-sw/python/3.7.7/Linux/CentOS7/x86_64/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/net/gs/vol3/software/modules-sw/python/3.7.7/Linux/CentOS7/x86_64/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/net/hawkins/vol1/Stem_Cell_Nanopore/epinano/software/EpiNano-Epinano1.2.0/epinano_modules.py", line 187, in split_tsv_for_per_site_var_freq
    head = next(tsv)
StopIteration

Do you have any idea why this may be happening?

Thanks!

bryanmkevan commented 4 years ago

I ran into this issue too. It has to do with Python's next() call and how it interacts with multiple processes running at once. I isolated it to a single process, extracted the error, and eventually tracked it down to a pandas version incompatibility. Switch to pandas 0.24.2. A recent major pandas update changed something in how files are read.
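For reference, the StopIteration in the traceback above is what Python's next() raises when the iterator it is given has nothing left to yield. The following is a minimal sketch of that behaviour, not EpiNano's code: `read_header` is a hypothetical helper, the in-memory streams stand in for whatever file `split_tsv_for_per_site_var_freq` reads, and the column names are made up.

```python
import io


def read_header(tsv):
    """Return the first line from an iterator of lines, or None if it is empty.

    Hypothetical helper for illustration; it is not part of EpiNano.
    """
    # next(iterator) raises StopIteration on an empty iterator;
    # passing a default value returns that default instead.
    return next(tsv, None)


# An empty stream stands in for an input file that contains no lines.
empty_stream = io.StringIO("")

try:
    next(empty_stream)
    raised = False
except StopIteration:
    raised = True

header_empty = read_header(io.StringIO(""))
# The column names below are invented for the example.
header_filled = read_header(io.StringIO("col_a\tcol_b\nval_1\tval_2\n"))

print(raised, header_empty, repr(header_filled))
```

If the same exception appears in a real run, checking whether the intermediate file handed to that function is empty may help narrow down which earlier step produced no output.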

Huanle commented 4 years ago

Hi @bryanmkevan, thanks heaps for your answer.

Hi @eliah-o, I tried installing the packages on different computers running CentOS and Ubuntu, as follows:

python -m venv epinano1.2_venv

source epinano1.2_venv/bin/activate

pip install pandas==0.25.1

pip install dask==2.5.2

With this setup, Epinano_Variants.py works. Maybe you could also try it this way, with a Python virtual environment. BTW, I have been using Python 3.6.

eliah-o commented 3 years ago

Thank you both for the suggestions. I created a conda virtual environment and tried both recommended versions of pandas using python 3.6 (I had previously been running 3.7). Unfortunately neither one fixed the issue. I'm still working on debugging the problem. @Huanle Do you have any further suggestions?

Celinet21 commented 3 years ago

@Huanle when I follow the above installation instructions in a Python virtual environment, I get an error: ModuleNotFoundError: No module named 'pysam'.

Huanle commented 3 years ago

@Celinet21, you need to install pysam: pip install pysam
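Since this thread turns on which package versions are installed, here is a small sketch for checking what the active environment actually provides. `report_versions` is a hypothetical helper, not part of EpiNano, and it assumes each package exposes a `__version__` attribute (it reports "unknown" otherwise).

```python
import importlib


def report_versions(names):
    """Map each module name to its version string, or a note if unavailable.

    Hypothetical helper for illustration; it is not part of EpiNano.
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            versions[name] = "not installed"
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# The packages mentioned in this thread.
for name, version in report_versions(["pandas", "dask", "pysam"]).items():
    print(f"{name}: {version}")
```

Running it with the virtual environment activated shows whether the pinned pandas and dask versions are the ones actually in use and whether pysam is missing.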