novoalab / EpiNano

Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads (Liu*, Begik* et al., Nature Comm 2019)
GNU General Public License v2.0

Stopiteration issue for Epinano_Variants.py #94

Closed xiaoyanzhang-web closed 3 years ago

xiaoyanzhang-web commented 3 years ago

Hello!

I have found that several other people have run into this issue as well, but I cannot solve it even after changing the package versions and Python version as previously suggested. I also used the sam2tsv.jar provided in the repo. Do you have any suggestions for this issue?

Thanks.

Huanle commented 3 years ago

Hi @xiaoyanzhang-web ,

Can you give a bit more detail? What command did you use? Thanks.

xiaoyanzhang-web commented 3 years ago

Hi @Huanle,

The command I used is:

python3 /users/EpiNano/Epinano_Variants.py -n 6 -R /users/GCF_015227675.2_mRatBN7.2_rna.fna -b /users/rep.sorted.bam -s /users/EpiNano/misc/sam2tsv.jar --type t

I have indexed the reference file with samtools and CreateSequenceDictionary.

The error is:

Process Process-2:
Traceback (most recent call last):
  File "/shared/core/python/3.6.9/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/shared/core/python/3.6.9/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "Epinano_Variants.py", line 45, in split_tsv_for_per_site_var_freq
    firstline = next (tsv)
StopIteration

Thanks.

Huanle commented 3 years ago

Hi @xiaoyanzhang-web,

Thanks for sharing your command. Have you created an index file with samtools faidx and a dictionary file with Picard CreateSequenceDictionary?

xiaoyanzhang-web commented 3 years ago

Hi @Huanle ,

Thank you for your help! Yes, I have created an index file with samtools faidx and a dictionary file with CreateSequenceDictionary; both are in the /users directory.

Thanks.
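
As a quick sanity check on the answer above, here is a minimal Python sketch that reports which index files are missing next to a reference FASTA. It assumes the naming seen in the directory listing later in this thread (`<reference>.fai` and `<reference>.dict`); the helper names `expected_index_files` and `missing_index_files` are hypothetical and are not part of EpiNano.

```python
import os


def expected_index_files(reference_file):
    """Return the index paths assumed to sit next to the reference.

    Assumption: the suffixes follow the listing in this thread
    (<reference>.fai and <reference>.dict). Adjust them if your
    tools write the files under different names.
    """
    return [reference_file + ".fai", reference_file + ".dict"]


def missing_index_files(reference_file):
    """Return the expected index paths that do not exist on disk."""
    return [
        path
        for path in expected_index_files(reference_file)
        if not os.path.isfile(path)
    ]
```

Calling `missing_index_files("/users/GCF_015227675.2_mRatBN7.2_rna.fna")` should return an empty list when both files are in place.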

Huanle commented 3 years ago

@xiaoyanzhang-web, what do you get if you run:

samtools view -h -F 3860 {bam_file} | java -jar {sam2tsv} -r {reference_file}
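
The same check can be scripted. The sketch below is only an illustration, assuming `samtools` and `java` are on the PATH: it builds the two commands from the pipeline above, connects them with a pipe, and counts up to `limit` lines of sam2tsv output. The function names `build_commands` and `count_tsv_lines` are hypothetical helpers, not EpiNano functions.

```python
import subprocess


def build_commands(bam_file, sam2tsv_jar, reference_file):
    """Return the argv lists for the two sides of the pipe shown above."""
    view_cmd = ["samtools", "view", "-h", "-F", "3860", bam_file]
    tsv_cmd = ["java", "-jar", sam2tsv_jar, "-r", reference_file]
    return view_cmd, tsv_cmd


def count_tsv_lines(bam_file, sam2tsv_jar, reference_file, limit=100):
    """Run samtools | sam2tsv and count output lines, stopping at `limit`.

    A return value of 0 means sam2tsv printed nothing to stdout.
    """
    view_cmd, tsv_cmd = build_commands(bam_file, sam2tsv_jar, reference_file)
    view = subprocess.Popen(view_cmd, stdout=subprocess.PIPE)
    tsv = subprocess.Popen(
        tsv_cmd,
        stdin=view.stdout,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; also works on Python 3.6
    )
    view.stdout.close()  # only the java process should hold the read end
    count = 0
    for _ in tsv.stdout:
        count += 1
        if count >= limit:
            break
    # We only need to know whether output exists, so stop both processes.
    tsv.stdout.close()
    tsv.terminate()
    view.terminate()
    tsv.wait()
    view.wait()
    return count
```

Replace the three arguments with your own BAM file, the path to sam2tsv.jar, and the reference FASTA.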

xiaoyanzhang-web commented 3 years ago

Hi @Huanle ,

I am not sure whether {bam_file} needs to be replaced with my own data. If so, the output is as shown in the picture:

[Image: Capture]

Best.

Huanle commented 3 years ago

Hi @xiaoyanzhang-web ,

Yes, the placeholders have to be replaced with your own files. The output looks fine to me. Can you copy and paste the following packages into a file, say requirements.txt?

atomicwrites==1.4.0
attrs==21.2.0
biopython==1.76
cloudpickle==1.6.0
dask==2.5.2
fsspec==2021.6.1
future==0.17.1
h5py==2.10.0
importlib-metadata==4.6.1
locket==0.2.1
mlpy==3.5.0
more-itertools==8.8.0
numpy==1.17.2
pandas==0.24.2
partd==1.2.0
pluggy==0.13.1
py==1.10.0
pysam==0.15.3
pytest==4.4.1
python-dateutil==2.8.1
pytz==2021.1
scikit-learn==0.20.2
scipy==1.5.4
six==1.16.0
toolz==0.11.1
typing-extensions==3.10.0.0
zipp==3.5.0

Then run pip install -r requirements.txt, and after that can you try Epinano_Variants.py again?
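
To confirm that the install actually produced the pinned versions, something like the following sketch could be used. It parses `name==version` lines and compares them with what is installed in the current environment. The helper names `parse_pins` and `find_mismatches` are hypothetical; on Python older than 3.8 it assumes the `importlib-metadata` backport (which is in the list above) is available.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for older Pythons


def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (wanted, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements.txt").read()))` returns an empty dict when every pin is satisfied.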

xiaoyanzhang-web commented 3 years ago

Hi @Huanle ,

Thank you for your help! I have installed all these packages, but I still get the StopIteration error. Does the tensorflow version matter? I found it hard to find a compatible tensorflow version, and my final choice was tensorflow=1.10.0.

[Images: tensorflow version1, tensorflow version2]

Huanle commented 3 years ago

Hi @xiaoyanzhang-web ,

No, tensorflow is not required. Do you mind sharing your BAM and reference files with me so that I can have a look? The error indicates that the tsv generator is empty. My guess is that sam2tsv does not produce anything.
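
To illustrate the diagnosis: in Python, calling `next()` on an exhausted or empty iterator without a default raises StopIteration, which is what the traceback shows at `firstline = next (tsv)`. The sketch below reproduces that pattern and shows one way to turn it into a clearer error. Both function names are hypothetical and this is not the actual EpiNano code.

```python
def read_first_line_unguarded(lines):
    """Same pattern as in the traceback: next() with no default value."""
    return next(lines)  # raises StopIteration if `lines` yields nothing


def read_first_line_guarded(lines):
    """Use a default so that empty input gives an explicit message."""
    first = next(lines, None)
    if first is None:
        raise ValueError(
            "no input lines: the upstream command (for example sam2tsv) "
            "produced no output"
        )
    return first
```

With an empty input such as `iter([])`, the first function raises StopIteration and the second raises ValueError with a message that points at the upstream step.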

xiaoyanzhang-web commented 3 years ago

Hi @Huanle ,

Okay, thank you so much for your help! I will share the data with you.

Huanle commented 3 years ago

Hi @xiaoyanzhang-web, I built a docker image so that you can skip installing python packages.

Suppose you have your data organized in this way:

ls -l xiao_data/
drwxr-xr-x 2 scarlet scarlet       4096 Jul 27 12:11 ./
drwxrwxr-x 3 scarlet scarlet       4096 Jul 27 16:48 ../
-rw-r--r-- 1 scarlet scarlet  377767118 Jul 24 00:56 GCF_015227675.2_mRatBN7.2_rna.fna
-rw-r--r-- 1 scarlet scarlet   18499829 Jul 24 00:56 GCF_015227675.2_mRatBN7.2_rna.fna.dict
-rw-r--r-- 1 scarlet scarlet    3493117 Jul 24 00:56 GCF_015227675.2_mRatBN7.2_rna.fna.fai
-rw-r--r-- 1 scarlet scarlet 1569751598 Jul 24 00:56 rep.sorted.bam
-rw-r--r-- 1 scarlet scarlet    4025528 Jul 24 00:56 rep.sorted.bam.bai

You could then run a command like this:

docker run -it -d --rm --name epivar2 -v "$PWD/xiao_data/":/project/ epi12 python3 /usr/local/bin/EpiNano/Epinano_Variants.py -R /project/GCF_015227675.2_mRatBN7.2_rna.fna -b /project/rep.sorted.bam -s /usr/local/bin/EpiNano/misc/sam2tsv.jar -n 2
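
In the command above, `-v "$PWD/xiao_data/":/project/` mounts the host directory inside the container at /project/, which is why the -R and -b arguments use /project/ paths rather than host paths. A small sketch of that mapping, assuming POSIX-style paths; the helper name `to_container_path` and the example host directory are hypothetical.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_dir, mount_point="/project"):
    """Translate a path under the mounted host directory to its container path.

    Raises ValueError if `host_path` is not inside `host_dir`, because such
    a file would not be visible through this mount.
    """
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_dir))
    return str(PurePosixPath(mount_point) / relative)
```

For instance, with a host directory of /home/me/xiao_data, the file /home/me/xiao_data/rep.sorted.bam maps to /project/rep.sorted.bam.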

If you would like to submit it with qsub on a cluster, I recommend building a Singularity image:

singularity pull docker://huanleliu/epi12

Then you can call EpiNano with a command like:

singularity exec -e /full/path/to/epi12_latest.sif python3 /usr/local/bin/EpiNano/Epinano_Variants.py -h 

Hope this works for you.