Closed. callumparr closed this issue 4 years ago.
Hi, it looks like something went wrong with NanoCount. Can you go to the folder /home/callum/master_of_pores/NanoPreprocess/work/3f/71df9957d8b2151cbe1af158b1896f and check the BAM etc.? Maybe it is something to report to the NanoCount developers.
L
ls -lhat master_of_pores/NanoPreprocess/work/3f/71df9957d8b2151cbe1af158b1896f/
total 320K
-rw-r--r-- 1 callum genome 1 May 1 17:15 .exitcode
-rw-r--r-- 1 callum genome 739 May 1 17:15 .command.err
-rw-r--r-- 1 callum genome 0 May 1 17:14 .command.out
drwxr-xr-x 2 callum genome 4.0K May 1 17:14 .
lrwxrwxrwx 1 callum genome 113 May 1 17:14 fast5_pass.minimap2.sorted.bam -> /home/callum/master_of_pores/NanoPreprocess/work/e4/c7f0dba2aae1507ddf9ff7e5bfd818/fast5_pass.minimap2.sorted.bam
-rw-r--r-- 1 callum genome 0 May 1 17:14 .command.begin
-rw-r--r-- 1 callum genome 739 May 1 15:31 .command.log
-rw-r--r-- 1 callum genome 262 May 1 15:30 .command.sh
-rw-r--r-- 1 callum genome 3.2K May 1 15:30 .command.run
drwxr-xr-x 4 callum genome 4.0K May 1 15:30 ..
Example of the BAM.
You can run samtools flagstat on the BAM, and then, if it is OK, run:
singularity exec -e path_of_the_image NanoCount -i fast5_pass.minimap2.sorted.bam -o fast5_pass.count
At that point, if it fails, we should open an issue with the NanoCount developers.
Output from samtools flagstat in.bam > out.flagstat:
3148715 + 0 in total (QC-passed reads + QC-failed reads)
1676236 + 0 secondary
82251 + 0 supplementary
0 + 0 duplicates
3148715 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
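As a quick sanity check on the counts above, the number of primary alignments can be derived as total minus secondary minus supplementary. Below is a minimal Python sketch, assuming the plain-text flagstat layout shown above; the function names are illustrative and not part of samtools or the pipeline.

```python
import re

# First lines of the flagstat output pasted above.
FLAGSTAT_TEXT = """\
3148715 + 0 in total (QC-passed reads + QC-failed reads)
1676236 + 0 secondary
82251 + 0 supplementary
0 + 0 duplicates
3148715 + 0 mapped (100.00% : N/A)
"""

def parse_flagstat(text):
    """Return {category: QC-passed count} for each flagstat line."""
    counts = {}
    for line in text.splitlines():
        # "<passed> + <failed> <category> (optional details)"
        m = re.match(r"(\d+) \+ (\d+) (.+?)(?: \(.*)?$", line.strip())
        if m:
            counts[m.group(3)] = int(m.group(1))
    return counts

def primary_alignments(counts):
    """Primary records = total - secondary - supplementary."""
    return counts["in total"] - counts["secondary"] - counts["supplementary"]

counts = parse_flagstat(FLAGSTAT_TEXT)
print(primary_alignments(counts))  # → 1390228
```

So more than half of the records in this BAM are secondary alignments, which is expected when reads are mapped to a transcriptome with many similar isoforms.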
Where can I find the Singularity images? I am not so familiar with this. Is there an equivalent of the docker system df command?
In the root of master_of_pores you should have a folder with all the images. The one you want should be biocorecrg-mopprepr*.img
(base) callum@dgt-gpu1:~/master_of_pores/NanoPreprocess$ singularity exec -e ../singularity/biocorecrg-mopprepr-0.2.img NanoCount work/3f/71df9957d8b2151cbe1af158b1896f/fast5_pass.minimap2.sorted.bam -o work/3f/71df9957d8b2151cbe1af158b1896f/fast5_pass.count
File "/usr/local/python/versions/3.6.3/bin/NanoCount", line 5, in <module>
from NanoCount.__main__ import main
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/__main__.py", line 14, in <module>
from NanoCount.NanoCount import NanoCount as nc
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/NanoCount.py", line 10, in <module>
import pandas as pd
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/__init__.py", line 42, in <module>
from pandas.core.api import *
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/api.py", line 26, in <module>
from pandas.core.groupby import Grouper
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/groupby/__init__.py", line 1, in <module>
from pandas.core.groupby.groupby import GroupBy # noqa: F401
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/groupby/groupby.py", line 37, in <module>
from pandas.core.frame import DataFrame
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 100, in <module>
from pandas.core.series import Series
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/series.py", line 4386, in <module>
Series._add_series_or_dataframe_operations()
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 10138, in _add_series_or_dataframe_operations
from pandas.core import window as rwindow
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/core/window.py", line 14, in <module>
import pandas._libs.window as libwindow
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/callum/miniconda3/lib/python3.6/site-packages/pandas/_libs/window.cpython-36m-x86_64-linux-gnu.so)
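Note that the traceback above starts in the container's Python (/usr/local/python/versions/3.6.3) but ends in pandas files under /home/callum/miniconda3, which suggests that a package from the host home directory is being imported inside the container. The sketch below is one way to spot this from the traceback text. The helper name is hypothetical and the prefix is taken from the paths above.

```python
import re

def foreign_modules(traceback_text, expected_prefix):
    """List file paths in a traceback that do not live under expected_prefix."""
    paths = re.findall(r'File "([^"]+)"', traceback_text)
    seen = []
    for p in paths:
        if not p.startswith(expected_prefix) and p not in seen:
            seen.append(p)
    return seen

# Three frames copied from the traceback above.
sample = '''File "/usr/local/python/versions/3.6.3/bin/NanoCount", line 5, in <module>
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/NanoCount.py", line 10, in <module>
File "/home/callum/miniconda3/lib/python3.6/site-packages/pandas/__init__.py", line 42, in <module>
'''

for path in foreign_modules(sample, "/usr/local/python/versions/3.6.3"):
    print(path)
```

Any path printed here comes from outside the container's Python installation, so the GLIBCXX mismatch may come from the host's pandas build rather than from the image itself.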
I also tried this on the BAM file for the library that ran through NanoPreprocess to completion, and it gave the same error.
Can you change the version of the container in the file master_of_pores/nextflow.global.config? From container = biocorecrg/mopprepr:0.2
to container = 'biocorecrg/mopprepr:0.4'. I have upgraded this container twice, so maybe it is time to upgrade it in master too.
Do I need to manually pull the image to master_of_pores/singularity?
No. If you change that line, it will be done automatically for you.
L
I updated and reran nextflow run nanopreprocess.nf with the -resume flag, but it seems not to have a cache, so it has to basecall again. I will let you know if it progresses past the NanoCount step.
When I tried singularity exec, it gave the same error.
Thanks to the authors of NanoCount, it is now able to work through this BAM file. Is there an option to skip steps when running NanoPreprocess? I no longer have the cache for this library, but I have the output events fast5s, fastq and so on.
Ouch, no... sorry. So what was the problem? Can you link the issue here?
https://github.com/a-slide/NanoCount/issues/5#issue-611231896
I think it is something weird about my BAM rather than a bug. I ran another library and it completed fine.
In this case I mapped direct RNA reads to the Ensembl cDNA transcriptome fasta (release 99). I guess there is nothing wrong with that.
ok thanks
Sorry for another question.
I can manually generate the count file, but how can I get the Singularity image to use the latest version of NanoCount, i.e. when running singularity exec -e image NanoCount ...?
If I run nextflow with -resume, it still gives the same error as before, so I guess it is using an older version of NanoCount.
UPDATE:
I added the path of the NanoCount I installed in my home directory, and bash .command.run now uses v0.2.1. Running nextflow with -resume then completes all the way to the end.
Can I add this other install directory path to nanopreprocess.nf so that it is always loaded into the command script when the work containers are created?
Hi, I added a new container with updated nanocount. You can do a git pull for changing the nextflow.global.config.
Thank you very much!
The pipeline worked well for one library, but when I repeated it on another library an error occurred during the assign fast5 creation step from the sorted alignment file. As far as I can see, the BAM file looks OK.
(base) callum@dgt-gpu1:~/master_of_pores/NanoPreprocess$ nextflow run nanopreprocess.nf -with-singularity -resume
N E X T F L O W ~ version 20.01.0
Launching `nanopreprocess.nf` [friendly_mirzakhani] - revision: 75e38aee97
╔╦╗┌─┐┌─┐┌┬┐┌─┐┬─┐ ┌─┐┌─┐ ╔═╗╔═╗╦═╗╔═╗╔═╗
║║║├─┤└─┐ │ ├┤ ├┬┘ │ │├┤ ╠═╝║ ║╠╦╝║╣ ╚═╗
╩ ╩┴ ┴└─┘ ┴ └─┘┴└─ └─┘└ ╩ ╚═╝╩╚═╚═╝╚═╝
====================================================
BIOCORE@CRG Preprocessing of Nanopore direct RNA - N F ~ version 0.1
kit : SQK-RNA001
flowcell : FLO-MIN106
fast5 : /analysisdata/rawseq/bcl/callum/Mouse_aging/tmp/Day2_09_DRS_gzip/20200222_0821_MN22588_FAL86574_47edbd96/fast5_pass/*.fast5
reference : /home/callum/transcriptome_tutorial/Analysis/ReferenceData/Mus_musculus.GRCm38.cdna.all.fa
annotation :
ref_type : transcriptome
seq_type : RNA
output : /analysisdata/rawseq/bcl/callum/Mouse_aging/tmp/Day2_09_DRS_gzip/NanoPreprocess_out
qualityqc : 7
granularity :
basecaller : guppy
basecaller_opt :
GPU : ON
demultiplexing :
demultiplexing_opt : -m pAmps-final-actrun_newdata_nanopore_UResNet20v2_model.030.h5
filter :
filter_opt :
mapper : minimap2
mapper_opt : -uf -k14
map_type : unspliced
counter : YES
counter_opt :
email : callum.parr@riken.jp
executor > local (1)
[06/602027] process > testInput [100%] 1 of 1, cached: 1 ✔
[f0/7968a9] process > baseCalling [100%] 401 of 401, cached: 401 ✔
[c5/f86746] process > concatenateFastQFiles [100%] 1 of 1, cached: 1 ✔
[9a/8da61f] process > QC [100%] 1 of 1, cached: 1 ✔
[b9/e3d27c] process > fastQC [100%] 1 of 1, cached: 1 ✔
[e4/c7f0db] process > mapping [100%] 1 of 1, cached: 1 ✔
[3f/71df99] process > counting [100%] 1 of 1, failed: 1 ✘
[- ] process > joinCountQCs -
[bd/af7981] process > alnQC [100%] 1 of 1, cached: 1 ✔
[be/54255f] process > joinAlnQCs [100%] 1 of 1, cached: 1 ✔
[76/b31c71] process > alnQC2 [100%] 1 of 1, cached: 1 ✔
[- ] process > multiQC [ 0%] 0 of 1
Pipeline BIOCORE@CRG Master of Pore completed!
Started at 2020-05-01T15:30:00.980+09:00
Error executing process > 'counting (fast5_pass)'
Caused by:
Process `counting (fast5_pass)` terminated with an error exit status (1)
Command executed:
NanoCount -i fast5_pass.minimap2.sorted.bam -o fast5_pass.count ; awk '{sum+=$3}END{print FILENAME"s/3.6"sum}' fast5_pass.count |sed s@.count@@g > fast5_pass.stats
samtools view -F 256 fast5_pass.minimap2.sorted.bam |cut -f 1,3 > fast5_pass.assigned
Command exit status:
1
Command output:
(empty)
Command error:
Parse Bam file and filter low quality hits
Traceback (most recent call last):
File "/usr/local/python/versions/3.6.3/bin/NanoCount", line 8, in <module>
sys.exit(main())
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/__main__.py", line 48, in main
verbose =args.verbose)
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/NanoCount.py", line 67, in __init__
self.read_dict = self._parse_bam ()
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/NanoCount.py", line 163, in _parse_bam
if self.scoring_value == "alignment_score" and hit.align_score/best_hit.align_score < self.equivalent_threshold:
ZeroDivisionError: division by zero
Work dir:
/home/callum/master_of_pores/NanoPreprocess/work/3f/71df9957d8b2151cbe1af158b1896f
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
Failed to invoke `workflow.onComplete` event handler
-- Check script 'nanopreprocess.nf' at line: 673 or see '.nextflow.log' file for more details
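For reference, the last step of the failed command (samtools view -F 256 ... | cut -f 1,3) drops secondary alignments and keeps the read name and reference name columns. A rough Python equivalent on SAM text lines is sketched below. The function name and the example records (read and transcript names) are made up for illustration.

```python
SECONDARY = 0x100  # SAM flag bit 256: secondary alignment

def assigned_pairs(sam_lines):
    """Yield (read_name, reference_name) for non-secondary SAM records."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines, if any
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & SECONDARY:
            continue  # same effect as samtools view -F 256
        yield fields[0], fields[2]  # columns 1 and 3, as in cut -f 1,3

# Made-up records: read name, flag, reference name, then placeholder columns.
records = [
    "read1\t0\ttx_A\t1\t60\t*\t*\t0\t0\t*\t*",
    "read1\t272\ttx_B\t1\t0\t*\t*\t0\t0\t*\t*",    # 256 + 16: secondary, dropped
    "read2\t16\ttx_C\t1\t60\t*\t*\t0\t0\t*\t*",
    "read2\t2048\ttx_D\t1\t60\t*\t*\t0\t0\t*\t*",  # supplementary, kept
]

for read_name, ref_name in assigned_pairs(records):
    print(read_name, ref_name, sep="\t")
```

Because only bit 256 is filtered, supplementary alignments (bit 2048) still end up in the output, so a read can appear on more than one line.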
From the container, running bash .command.run:
Parse Bam file and filter low quality hits
Traceback (most recent call last):
File "/usr/local/python/versions/3.6.3/bin/NanoCount", line 8, in <module>
sys.exit(main())
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/__main__.py", line 48, in main
verbose =args.verbose)
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/NanoCount.py", line 67, in __init__
self.read_dict = self._parse_bam ()
File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanoCount/NanoCount.py", line 163, in _parse_bam
if self.scoring_value == "alignment_score" and hit.align_score/best_hit.align_score < self.equivalent_threshold:
ZeroDivisionError: division by zero
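The failing line divides hit.align_score by best_hit.align_score, so it raises as soon as the best hit of a read has an alignment score of 0. The sketch below shows one possible guard, written as a multiplication so that no division happens. It is not NanoCount's actual fix (see the linked NanoCount issue for that); the Hit type, the function name and the threshold value 0.9 are assumptions modelled on the names in the traceback, and scores are assumed to be non-negative.

```python
from collections import namedtuple

# Hypothetical stand-in for NanoCount's hit objects.
Hit = namedtuple("Hit", ["name", "align_score"])

def is_equivalent(hit, best_hit, equivalent_threshold):
    """True if hit scores at least equivalent_threshold times the best hit.

    For positive scores this matches the ratio test
    hit.align_score / best_hit.align_score >= equivalent_threshold,
    but it never divides, so a best score of 0 cannot raise.
    """
    return hit.align_score >= equivalent_threshold * best_hit.align_score

best = Hit("tx_A", 100)
print(is_equivalent(Hit("tx_B", 95), best, 0.9))   # → True
print(is_equivalent(Hit("tx_C", 50), best, 0.9))   # → False

# The case that crashed: the best hit has an alignment score of 0.
zero_best = Hit("tx_D", 0)
print(is_equivalent(Hit("tx_E", 0), zero_best, 0.9))  # → True
```

Whether zero-score hits should count as equivalent or be discarded altogether is a design decision for the tool; the sketch only shows that the comparison can be made safe.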