pRESTO is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq). pRESTO is a bioinformatics toolkit for processing high-throughput lymphocyte receptor sequencing data.
This was working fine before, but recently (possibly since a newer Biopython release) this example fails when masking the V primers.

    #!python
      File "/usr/local/bin/MaskPrimers.py", line 644, in <module>
        maskPrimers(**args_dict)
      File "/usr/local/bin/MaskPrimers.py", line 475, in maskPrimers
        primers = readPrimerFile(primer_file)
      File "/usr/local/lib/python3.4/dist-packages/presto/IO.py", line 41, in readPrimerFile
        for p in primer_iter}
      File "/usr/local/lib/python3.4/dist-packages/presto/IO.py", line 40, in <dictcomp>
        primers = {p.description: str(p.seq).upper()
      File "/usr/local/lib/python3.4/dist-packages/Bio/SeqIO/__init__.py", line 591, in parse
        for r in i:
      File "/usr/local/lib/python3.4/dist-packages/Bio/SeqIO/FastaIO.py", line 124, in FastaIterator
        for title, sequence in SimpleFastaParser(handle):
      File "/usr/local/lib/python3.4/dist-packages/Bio/SeqIO/FastaIO.py", line 45, in SimpleFastaParser
        line = handle.readline()
      File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 582: ordinal not in range(128)

The problem seems to be with Greiff2014_VPrimers.fasta: there are some extra non-printable characters at the end of the file. If I delete those lines at the end, everything runs fine.
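For anyone hitting the same error, here is a rough sketch of how the offending bytes could be located and removed without editing the file by hand. It is not part of pRESTO; the function names (`find_non_ascii`, `strip_non_ascii`, `clean_fasta`) and the output filename are made up for illustration, and it assumes the primer file is meant to be plain ASCII.

```python
from pathlib import Path


def find_non_ascii(data: bytes):
    """Return (offset, byte value) pairs for every byte outside the ASCII range."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


def strip_non_ascii(data: bytes) -> bytes:
    """Drop non-ASCII bytes, then drop lines left empty or whitespace-only."""
    ascii_only = bytes(b for b in data if b <= 0x7F)
    lines = [line for line in ascii_only.splitlines() if line.strip()]
    return b"\n".join(lines) + b"\n"


def clean_fasta(src, dst):
    """Hypothetical helper: report non-ASCII bytes in src, write a cleaned copy to dst."""
    data = Path(src).read_bytes()
    hits = find_non_ascii(data)
    for offset, value in hits:
        print(f"{src}: byte 0x{value:02x} at position {offset}")
    Path(dst).write_bytes(strip_non_ascii(data))
    return hits
```

Calling something like `clean_fasta("Greiff2014_VPrimers.fasta", "Greiff2014_VPrimers.clean.fasta")` should print the positions of the stray bytes and leave the original file untouched. For what it's worth, 0xc2 is the lead byte of a two-byte UTF-8 sequence (a non-breaking space is encoded as C2 A0, for example), which would be consistent with invisible characters at the end of the file, though I have not confirmed which character it actually is.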
Original report by Scott Christley (Bitbucket: [Scott Christley](https://bitbucket.org/Scott Christley)).