lmdu / pyfastx

a python package for fast random access to sequences from plain and gzipped FASTA/Q files
https://pyfastx.readthedocs.io
MIT License

UnicodeDecodeError: 'utf-8' codec can't decode bytes in position #56

Closed by Muchmorepig 1 year ago

Muchmorepig commented 1 year ago

[image]
python version: 3.8.12
pyfastx version: 0.8.4

import pyfastx 
fq2 = pyfastx.Fastq("./tmp/tmp2_2.fq.gz")
for read in fq2:
    print(read.seq)

Here is the head of my fq file:

@A01225:536:HFNG7DSX5:3:1101:1217:1000 2:N:0:TAAGGCGA+GCGATCTA
CAAGCGTTGGCTTCTCGCATCTGTACGGTGTCAACGCTTAATCCACGTGCTTGAGAGGCCAGAGCATTCGACACAGAATTTTTTTTTTTTTTCGGTTAGTACGAGATGGCTTCACACCTCGCGGTGCATCCAACTCTCGCCTATCGAAGT
+
,FFFFFF:FFFFFFFFFFFFFFFFFFFFFFF:FF:FFFF:,FFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFF:FFFFFFFFFFFF:F,FF::FFF:FF,:FF,:F,FFFFF,FFF:FFFF,FFFF,F,FFF:,F:FFF,FF:F
@A01225:536:HFNG7DSX5:3:1101:1253:1000 2:N:0:TAAGGCGA+GCGATCTA
CAAGCGTTGGCTTCTCGCATCTTCGTCCATCATACTAGTAATCAACGTGCTTGAGAGGCCAGAGCATTCGAAGATCTTTTTTTTATTTTTTTTTGTTTTTGTTTTAAAAAATTTATGAAAAAATTGATACGCCAATTATTACTAAAACTC
+
:F:,F,FFFFFF,FFFFFFFFFF:,FF:FFFF,F,FFFFFFFFF,FFFFFFFFFFFF,F,FFF,FFFFF,F,F,,,:FFFFFFF,F:F:FFFF:,F:,FF,,::F:,FFFF,FF,,,,F,FF,F:::,:,,F,:F,:F,F:,:F:,F,F,
Muchmorepig commented 1 year ago

Can anyone help me?

lmdu commented 1 year ago

I tested pyfastx with the FASTQ records you provided and could not reproduce the error. If it is convenient, could you please send me the whole test file? Thank you for reporting this issue; I found another bug in the Fastq parser. By the way, your running environment does not look like a terminal.
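
Since the error could not be reproduced from the pasted records, one way to narrow it down is to check whether the gzipped file itself contains bytes that are not valid UTF-8. The following is a minimal sketch using only the Python standard library. It does not use pyfastx, and the helper name `find_non_utf8_lines` is hypothetical, made up for illustration. It assumes the file is an ordinary gzip-compressed text file.

```python
import gzip


def find_non_utf8_lines(path, limit=5):
    """Return up to `limit` (line_number, raw_bytes) pairs that fail UTF-8 decoding.

    Hypothetical diagnostic helper; not part of pyfastx.
    """
    bad = []
    # Open in binary mode so decoding errors surface here, not inside a parser.
    with gzip.open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                # Keep only the first 60 bytes of the offending line.
                bad.append((number, raw[:60]))
                if len(bad) >= limit:
                    break
    return bad
```

Calling `find_non_utf8_lines("./tmp/tmp2_2.fq.gz")` on the file from the report would return an empty list if every line decodes cleanly, which would point to the parser rather than the file contents.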

lmdu commented 1 year ago

Fixed in version 0.9.1.

Muchmorepig commented 1 year ago

thanks