Closed Boer223 closed 4 years ago
Hi @Chipcui
So, you're first filtering unmapped reads in the BAM with
samtools view -F 0x4 -bo allContigs_unsorted.filtered.bam allContigs_unsorted.bam
Have you run the following command? I would expect that you would run into issues here as well---the errors are meant to give the user some sense if there are QC problems with the inputs:
perl checkBAM_SVs_and_INDELs.pl --BAM allContigs_unsorted.bam
--referenceFasta GRCh38_full_plus_hs38d1_analysis_set_minus_alts.fa
--readsFasta allContigs.fa
--sam2alignment_executable ../src/sam2alignment
In any sense, the error Missing primary non-supplementary alignment for read 000351F_pilon_pilon_pilon
is from here:
https://github.com/NCBI-Hackathons/NovoGraph/blob/master/src/sam2alignment.cpp#L569-L580
Note that 000351F_pilon_pilon_pilon
is the read ID which was extracted from your FASTA westar.fa
here: https://github.com/NCBI-Hackathons/NovoGraph/blob/master/src/sam2alignment.cpp#L181
That FASTA westar.fa
should have been the FASTA you used to create the BAM.
So, as the error says, it looks like a primary alignment should be there, but it's not. (There are many such catches within sam2alignment.cpp
to inform the user that there are problems with the inputs.)
I expect the error could be due to the filter samtools view -q 30 -F 256
, which means you're filtering out all secondary alignments and skipping alignments with MAPQ < 30. I'm not sure why that filter would be necessary.
Thank you @evanbiederstedt
As you suggested, I have run
perl /home/cuixb/tools/biosoft/NovoGraph-1.0.0/scripts/checkBAM_SVs_and_INDELs.pl --BAM all.genomes.filtered.bam --referenceFasta westar.fa --readsFasta all.genomes.fa --sam2alignment_executable ~/tools/biosoft/NovoGraph-1.0.0/src/sam2alignment
to check the bam file, but it occurs the error again.
Read westar.fa
done.
Read all.genomes.fa
done.
terminate called after throwing an instance of 'std::runtime_error'
what(): Missing primary non-supplementary alignment for read 001834R_pilon_pilon_pilon
sh: line 1: 2913 Aborted (core dumped) /home/cuixb/tools/biosoft/NovoGraph-1.0.0/src/sam2alignment all.genomes.filtered.bam.extract.SAM westar.fa > all.genomes.filtered.bam.extract.alignments
Cannot execute command: /home/cuixb/tools/biosoft/NovoGraph-1.0.0/src/sam2alignment all.genomes.filtered.bam.extract.SAM westar.fa > all.genomes.filtered.bam.extract.alignments at /home/cuixb/tools/biosoft/NovoGraph-1.0.0/scripts/checkBAM_SVs_and_INDELs.pl line 84.
@Chipcui How did you create the BAM?
Missing primary non-supplementary alignment for read 001834R_pilon_pilon_pilon
Note that this error is a different read ID, 001834R_pilon_pilon_pilon
. These are identifiers from your FASTA file (which should be the FASTA you used to create the BAM). These are primary alignments which should exist in your BAM. You could try to find them with
samtools view input.bam | grep "001834R_pilon_pilon_pilon"
or similar commands. Trying the suggestCommands.pl
command may be useful here as well.
Hi,
I created the bam of each genome fasta file and then merged all the bam files together, the command was like this:
minimap2 --paf-no-hit -a -x asm5 --cs -r2k westar.fa genome.fa | samtools view -bhS - | samtools sort -o genome.sorted.bam -
Hi @Chipcui
It would be helpful if you output:
samtools view input.bam | grep "001834R_pilon_pilon_pilon"
There's something very strange here, as the error is suggesting for multiple reads there is no primary alignment, i.e. the read "001834R_pilon_pilon_pilon" and the read"000351F_pilon_pilon_pilon"
RE: minimap command
Could you detail the flags you're using with minimap2
somewhat? I'm still a bit confused by:
--paf-no-hit
In PAF, output unmapped queries; the strand and the reference name fields are set to ‘*’.
Warning: some paftools.js commands may not work with such output for the moment.
Hi,
As you suggested, I use samtools view all.genomes.filtered.bam | grep "001834R_pilon_pilon_pilon" >error.sam
to output the "001834R_pilon_pilon_pilon" read alignment, but I think there is non error with the read alignment. Here is the output sam file:
Hi @Chipcui
Sorry for the delay.
I'm still a bit confused by what you've sent----it's weird to have reads with supplementary alignments, but no primary alignments.
Could you please try two things: (1) The minimap2 command you used is the following:
minimap2 --paf-no-hit -a -x asm5 --cs -r2k westar.fa genome.fa | samtools view -bhS - | samtools sort -o genome.sorted.bam -
Why did you use this? Could you detail this a bit more?
(2) Could you try the exact command we have in the README, and check if the error persists?
## Align the contigs FASTA against the reference, outputting a single BAM
minimap2 -a -x asm20 GRCh38_full_plus_hs38d1_analysis_set_minus_alts allContigs.fa | samtools view -Sb - > allContigs_unsorted.bam
Thanks, Evan
Hi @Chipcui
Were you able to figure out the problem?
Best, Evan
I have tried the exact command in the novograph, but it also did not work. The error is same as above. Maybe my alignment process had some problem.
Could you share the FASTA header here? This is very strange.
You mean the each fasta sequence name? Now our server has some problem to connect, I'll share them later.
It sounds to me like a problem I faced at some point too. When you map the fastas seperately and combine the *bams
after the fact you have to make sure, that you combine the fastas too and provide the combined when you call BAM2ALIGNMENT.pl
I suspect in the FASTA-ids 000351F_pilon_pilon_pilon
will be missing.
I guess we should clarify this in the README so that users don't become confused.
I was mistaken. The error that I receive when I run sam2alignment on a concatenated bam and a singular fasta file is:
[W::fai_get_val] Reference read1 not found in FASTA file, returning empty sequence
So @Chipcui error seems to be different.
Any update on this, @Chipcui ?
Hi,
When I run the Step 1 to get the
AlignmentInput.txt
file, it occurs with the following error:Before I run this step, I also filter the bam file with
So, how to solve this error?
Thank you in advance!