wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

Error with bam as input #112

Closed dcopetti closed 4 years ago

dcopetti commented 6 years ago

Hello, I am able to run NanoPlot on the concatenated fastq of my raw reads, but it dies when I supply a bam file of the alignment to the reference (log file attached: Rabiosa_bamNanoPlot_20181120_1722.log). The bam was obtained by aligning the reads with minimap2 (minimap2 -ax map-ont) and reformatting with samtools. Can you help me figure out why it is not working? Thanks

wdecoster commented 6 years ago

Hmmm, haven't seen this error before. Which reformatting exactly did you do with samtools? Is your bam file exceptionally large?

dcopetti commented 6 years ago

Yes, the bam file is 142 GB. Attached is the NanoPlot report generated from the fastq: NanoPlot_report_raw_reads.pdf

The manipulation was something like:

samtools view -@10 -b -T genome.fa file.sam -b >file.bam
samtools sort file.bam>files.bam

I got the fastq files directly from the sequencing center (I am not sure which software they used for base calling). On a side note, how does the sample look regarding the quality of the data? It is a plant sample run on a PromethION. I am wondering whether these quality values (mostly between 4 and 8) are to be expected for this platform. Thanks
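
For context on those quality values: a Phred score Q corresponds to a per-base error probability of 10^(-Q/10), so Q4 is roughly a 40% error rate and Q8 roughly 16%. A minimal Python sketch of the conversion follows; the function names are illustrative and are not part of NanoPlot.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to the expected per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a per-base error probability back to a Phred quality score."""
    return -10 * math.log10(p)


# Print the expected error rate for a few quality values,
# including the 4 to 8 range mentioned above.
for q in (4, 8, 10, 20):
    print(f"Q{q}: about {phred_to_error_prob(q):.1%} expected error per base")
```

This only restates the Phred definition; whether such values are typical for a given flow cell or basecaller is a separate question.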

dcopetti commented 6 years ago

I am now trying to run it on a subset of the file, made by filtering out non-primary alignments (samtools view -F 256): it is only 64 GB.

NanoPlot -t 10 -o Rabiosa_bam -p Rabiosa_bam --bam ONT_to_genome_256.bam  --N50 --title Rabiosa_1_2_bam_256 --store
/home/copettid/miniconda2/envs/py35/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/copettid/miniconda2/envs/py35/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/copettid/miniconda2/envs/py35/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)

These three warning lines also came up with the fastq file, so I won't worry about them for now. Just FYI.
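
A note on the `samtools view -F 256` filter mentioned in this comment: `-F` drops any record whose FLAG has at least one of the given bits set. In the SAM specification, 0x100 (256) marks secondary alignments and 0x800 (2048) marks supplementary ones, so `-F 256` alone still keeps supplementary records, while `-F 2304` would exclude both. The sketch below only mimics that bit test in Python; the helper name is hypothetical.

```python
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment


def passes_exclude_filter(flag, exclude_mask):
    """Mimic `samtools view -F`: keep a record only if no excluded bit is set."""
    return (flag & exclude_mask) == 0


# Example FLAG values: forward primary, reverse primary, unmapped,
# secondary, secondary on the reverse strand, supplementary.
flags = [0, 16, 4, 256, 272, 2048]

kept_f256 = [f for f in flags if passes_exclude_filter(f, SECONDARY)]
kept_f2304 = [f for f in flags if passes_exclude_filter(f, SECONDARY | SUPPLEMENTARY)]

print("kept with -F 256: ", kept_f256)
print("kept with -F 2304:", kept_f2304)
```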

wdecoster commented 6 years ago

Yeah those warnings do not worry me :)

What do you get with samtools view -H yourfile.bam?

dcopetti commented 6 years ago

Good point. On the bam file that gave the error, I have the scaffolds in this order:

@HD     VN:1.5  SO:coordinate
@PG     ID:minimap2     PN:minimap2     VN:2.13-r850    CL:minimap2 -ax map-ont -t 12 180226_ryegrass_assembly_IPK_org_sm.fa ../Rabiosa_ONT_181026_2kb.fa
@SQ     SN:Lm_cp_gi_427437197_refNC_019651_1    LN:135175
@SQ     SN:Lp_mt_gi472833546_gb_JX999996_1      LN:678580
@SQ     SN:scaffold_10x_1       LN:1306062
@SQ     SN:scaffold_10x_10      LN:1526409
@SQ     SN:scaffold_10x_100     LN:5109335
@SQ     SN:scaffold_10x_1000    LN:281034
@SQ     SN:scaffold_10x_10000   LN:1252
@SQ     SN:scaffold_10x_100000  LN:500
@SQ     SN:scaffold_10x_1000000 LN:573
@SQ     SN:scaffold_10x_1000001 LN:573
@SQ     SN:scaffold_10x_1000002 LN:573

On the one running now (aligned against a different reference):

@HD     VN:1.5  SO:coordinate
@PG     ID:minimap2     PN:minimap2     VN:2.13-r850    CL:minimap2 -ax map-ont -t 7 ../Rabiosa_genome_1.0_sm_orgs.fa ../Rabiosa_ONT_181026_2kb.fa
@SQ     SN:scaffold112-1        LN:540737
@SQ     SN:scaffold112-2        LN:815233
@SQ     SN:scaffold11392-1      LN:1308993
@SQ     SN:scaffold11392-2      LN:845298
@SQ     SN:scaffold11417-1      LN:1075560
@SQ     SN:scaffold11417-2      LN:5118248
@SQ     SN:scaffold1198-1       LN:282254
@SQ     SN:scaffold1198-2       LN:2212143
@SQ     SN:scaffold1368-1       LN:2667244

It looks like I actually aligned the fasta: is that still OK?

With this second bam, I still have the three warning lines, plus now these:

[E::bgzf_read] Read block operation failed with error 4 after 132 of 592 bytes
[E::bgzf_read] Read block operation failed with error 4 after 4 of 144 bytes
[E::bgzf_read] Read block operation failed with error 4 after 732 of 2416 bytes

wdecoster commented 6 years ago

The header looks okay I think... aligning a fasta file should be okay, I think :-)

Those errors suggest something is wrong with your bam file:
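
One quick thing that can be checked, assuming the `bgzf_read` errors come from a truncated file: a complete BAM file ends with a fixed 28-byte empty BGZF block (the EOF marker defined in the SAM/BAM specification). The Python sketch below, whose function name is hypothetical, only tests for that marker and cannot detect corruption in the middle of the file. `samtools quickcheck` performs a comparable check from the command line.

```python
import os

# The 28-byte empty BGZF block that terminates every complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804 00000000 00ff 0600 4243 0200 1b00 0300 00000000 00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes, so this is cheap even for very large files.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

Calling `has_bgzf_eof("yourfile.bam")` returning `False` would point to an incomplete write or transfer; `True` does not prove the rest of the file is intact.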

wdecoster commented 5 years ago

Did you manage to solve this issue? If not, could you perhaps share your bam file for debugging?

dcopetti commented 5 years ago

After ~5 days, it is still running:

$ NanoPlot -t 10 -o genome_bam -p genome_bam --bam ONT_to_genome2s_256.bam  --N50 --title genome_1_2_bam_256 --store
/home/copettid/miniconda2/envs/py35/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/copettid/miniconda2/envs/py35/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/home/copettid/miniconda2/envs/py35/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
[E::bgzf_read] Read block operation failed with error 4 after 132 of 592 bytes
[E::bgzf_read] Read block operation failed with error 4 after 4 of 144 bytes
[E::bgzf_read] Read block operation failed with error 4 after 732 of 2416 bytes

I am sending you the link to the file via email

wdecoster commented 5 years ago

5 days... that's an embarrassingly long time :-/ Thanks, I'll take a look.