I am using angsd to produce fasta sequences; it automatically writes them as gzipped fasta files.
If I open the fasta file with R Biostrings, the sequence composition looks like this:
> library(Biostrings)
> dna <- readDNAStringSet("WSBg.asm5.fa.gz")
> alphabetFrequency(dna[1])
            A        C        G        T M R W S Y K V H D B        N - + .
[1,] 53974765 37689595 37636633 53870814 0 0 0 0 0 0 0 0 0 0 11982472 0 0 0
However, for the same fasta.gz file the sequence composition with pyfastx looks like this:
Could you please indicate what the '\x00' would mean? Could it be that pyfastx cannot correctly index or read these gzipped files?
Thank you in anticipation.
Best regards
Kristian
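
P.S. In case it helps to narrow this down, a library-independent check could look roughly like the sketch below. It uses only Python's standard library (gzip and collections.Counter) to count every byte on the sequence lines of the decompressed file, so a non-zero count for byte value 0 would mean the '\x00' characters are really in the data rather than introduced while indexing. The function name sequence_byte_counts is just a placeholder of mine, it sums over all records rather than only the first one, and it is not optimised for speed on a genome-sized file.

```python
import gzip
from collections import Counter


def sequence_byte_counts(path):
    """Count every byte found on the sequence lines of a gzipped FASTA file.

    Hypothetical helper for illustration: header lines (starting with '>')
    and line endings are skipped; all records are summed together.
    """
    counts = Counter()
    with gzip.open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                continue  # header line, not sequence
            # Iterating over bytes yields integers (0-255), so the keys
            # of the counter are byte values, e.g. 65 for 'A', 0 for NUL.
            counts.update(line.rstrip(b"\r\n"))
    return counts


# Example usage (file name taken from the R session above):
# counts = sequence_byte_counts("WSBg.asm5.fa.gz")
# for byte_value, n in sorted(counts.items()):
#     print(repr(chr(byte_value)), n)
# print("NUL bytes:", counts[0])
```

If counts[0] comes back as zero here while pyfastx still reports '\x00', that would point towards how the file is being indexed rather than towards the file contents.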