wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License
429 stars 47 forks source link

using bam as an input for NanoPlot #63

Closed homeveg closed 6 years ago

homeveg commented 6 years ago

I was trying to run NanoPlot with BAM file, which was generated using BWA MEM | samtools :

bwa mem -x ont2d Homo_sapiens.GRCh38.80.dna.primary_assembly.fa QCtrimmed.fastq -t 100 | samtools view -S -F 4 -b - > QCtrimmed.mapped.bam

After sorting the BAM file with "samtools sort QCtrimmed.mapped.bam" I used it as an input for a NanoPlot:

NanoPlot --bam QCtrimmed.mapped.sorted.bam -t 100 --maxlength 2000 -o out/NanoPlot/human --readtype 1D -p QCtrimmed

The NanoPlot crashed with the following exception:

concurrent.futures.process._RemoteTraceback: """ Traceback (most recent call last): File "/usr/lib64/python3.5/concurrent/futures/process.py", line 175, in _process_worker r = call_item.fn(*call_item.args, *call_item.kwargs) File "/usr/lib64/python3.5/concurrent/futures/process.py", line 153, in _process_chunk return [fn(args) for args in chunk] File "/usr/lib64/python3.5/concurrent/futures/process.py", line 153, in return [fn(*args) for args in chunk] File "/usr/lib/python3.5/site-packages/nanoget/nanoget.py", line 259, in process_bam samfile = check_bam(bam) File "/usr/lib/python3.5/site-packages/nanoget/nanoget.py", line 233, in check_bam if not samfile.header['HD']['SO'] == 'coordinate': File "pysam/libcalignmentfile.pyx", line 540, in pysam.libcalignmentfile.AlignmentHeader.getitem KeyError: 'HD' """

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "/usr/bin/NanoPlot", line 11, in load_entry_point('NanoPlot==1.13.0', 'console_scripts', 'NanoPlot')() File "/usr/lib/python3.5/site-packages/nanoplot/NanoPlot.py", line 62, in main barcoded=args.barcoded) File "/usr/lib/python3.5/site-packages/nanoget/nanoget.py", line 74, in get_input dfs=[out for out in executor.map(extration_function, files)], File "/usr/lib/python3.5/site-packages/nanoget/nanoget.py", line 74, in dfs=[out for out in executor.map(extration_function, files)], File "/usr/lib64/python3.5/concurrent/futures/_base.py", line 556, in result_iterator yield future.result() File "/usr/lib64/python3.5/concurrent/futures/_base.py", line 405, in result return self.get_result() File "/usr/lib64/python3.5/concurrent/futures/_base.py", line 357, in get_result raise self._exception KeyError: 'HD'

wdecoster commented 6 years ago

A few questions: Does your bam file have a proper header? What's the result of samtools index QCtrimmed.mapped.sorted.bam? Can you show some lines of samtools view QCtrimmed.mapped.sorted.bam | less -S?

homeveg commented 6 years ago

I guess, the issue may be with the Samtools version or BWA itself, because the "@HD VN: SO:coordinates" line is not present in the header. I'll try to add this line manually hear is few lines from header:

samtools view -H QCtrimmed.mapped.sorted.bam @SQ SN:1 LN:248956422 @SQ SN:10 LN:133797422 @SQ SN:11 LN:135086622 ... @SQ SN:KI270728.1 LN:1872759 @SQ SN:KI270727.1 LN:448248 @SQ SN:KI270442.1 LN:392061 ... @PG ID:bwa PN:bwa VN:0.7.17-r1194-dirty CL:bwa mem -x ont2d Homo_sapiens.GRCh38.80.dna.primary_assembly.fa QCtrimmed.fastq -t 100

samtools view QCtrimmed.mapped.sorted.bam | less -L 7b379322-416d-4594-a44a-bdce2c99baf1 0 1 11003 0 1S3M1D69M4D33M13I1M3I34M1D20M2D14M1I7M3I19M2D1M1D25M1I4M4I15M1I7M1D7M1I6M41S 0 0 CGCCACGGGTGAGA dad61f96-7dac-4db6-910d-62389f9aa0f1_1 16 1 12150 0 1S19M1D11M1D2M1D7M1I3M1I13M1D14M1D10M1I4M1D2M1D28M1I19M1D48M1I10M1D4M3I19M11S 0 0 CTAATACCACGACC c54a8cbe-520d-4bf5-b64d-3c92b2838525 0 1 14546 0 12M2D13M2D10M2D16M3D4M1D1M1D11M1D46M1D14M1D1M1D18M1D2M1D9M1I4M1D29M2I18M3I8M1I10M1I34M1D18M2D8M3D9M2D21M1I7M2D8M3D16M1

wdecoster commented 6 years ago

You should always make sure to use up to date versions, so you indeed may want to check your samtools and bwa.

Using a recent samtools version you also don't need the -S flag, and you could also pipe your samtools view directly to samtools sort.

homeveg commented 6 years ago

well, unfortunately, I can't re-install samtools to the latest, and have to use slightly outdated version. But anyway, after I've edited manually the header of the SAM file, everything is working now. The issue is closed now. Thanks for you help!

wdecoster commented 6 years ago

Good to hear.

If you cannot update samtools because you don't have root permission (which I also don't have) you can install anaconda or miniconda and have samtools installed in your home directory.