Closed davidlougheed closed 2 years ago
maf-sort
sorts them into a wrong order: it sorts the alignments by their location in the genome, whereas they should be sorted in query-sequence order.
They should be in the right order to start with, I have no idea why not... how exactly did you get the .maf
?
thanks for the prompt response!
the maf is generated with something like this:
bedtools bamtofastq -i "${bam}" -fq "${scratch}/${fq}"
last-train -P32 -Q0 hg38db "${scratch}/${fq}" > "${scratch}/${par}"
lastal -P32 -p "${scratch}/${par}" hg38db "${scratch}/${fq}" | last-split > "${scratch}/${maf}"
where bam/fq/par/maf are variables pointing to different files
the weird part of this workflow I guess is that I'm converting CCS reads from a BAM back to a fastq file; I've been able to do this with some individuals from the 1000Genomes dataset (which worked) and now I'm trying it on HG002 from Genome in a Bottle.
That seems like it should work fine, and the alignments should be in the right order. Scratching my head...
lastdb
in a strange way?By the way, this should sort them into the right order (though if 2 reads have the same name, we have bigger problems):
maf-swap myfile.maf | maf-sort | maf-swap > out.maf
lastdb was ran like this:
lastdb -P32 -uNEAR hg38db ./ref/hg38.analysisSet.fa
if the DB could cause it, it's possible that I need to regenerate it for GiaB's reference FASTA file – they are both hg38, but it might be slightly different vs. 1000 genomes I suppose.
I'll take a look into those other possibilities as well!
ah I found the issue - there do seem to be duplicates of each read in the FASTQ file for some reason, so I'll mark the issue as closed. thanks for all the help!
edit - if anyone in the future comes across this, using Picard's SAM to FASTQ instead of bedtools
seems to fix the issue
Thanks for sharing the solution!
I'm getting the error
tandem-genotypes: error: the alignments are in the wrong order
when I run tandem-genotypes. I've ran the.maf
file throughmaf-sort
but the error persists. Any ideas on what could cause this?