Closed carden24 closed 4 years ago
Thanks, @carden24. I will add the change into the next release.
Surprisingly, SUPER-FOCUS users have formatted the same database file before, and this is the first time I have seen this error.
Best
Fixed - Thanks
It is a very unusual error indeed. You will only see it if you have a hit against that subject in the database. I do not know if it only shows up with my version of Python 3 (3.6.10) or csv (1.0). Thanks for the quick fix. Feel free to close the issue.
gotcha! thanks again.
I ran into problems parsing diamond alignments created with the latest version of SUPER-FOCUS (0.34, released on Apr 2, 2019).
The issue is that one of the sequences in the database FASTA files contains non-UTF-8 characters.
I found them using this command:
The culprit is this sequence:
It looks fine at first glance, but if you inspect the characters, it contains an odd one: ^V = SYN (synchronous idle). ^$ is the end-of-line character.
I found this problem in the 100_clusters.fasta file.
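For anyone who wants to check their own database files, here is a minimal sketch of one way to locate such characters. It is not the command I used above; the function name `find_suspicious_bytes` is made up for illustration. It reads the file in binary mode and reports control bytes (other than tab and line endings) and any byte outside 7-bit ASCII.

```python
import re

# Control bytes other than tab, LF and CR, plus DEL and any byte above 7-bit ASCII.
SUSPICIOUS = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\xff]")


def find_suspicious_bytes(path):
    """Return (line_number, byte_offset, byte_value) for each suspicious byte.

    Hypothetical helper for illustration; not part of SUPER-FOCUS.
    """
    hits = []
    # Binary mode: nothing is decoded, so the scan itself cannot hit a decode error.
    with open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            for match in SUSPICIOUS.finditer(line):
                hits.append((line_number, match.start(), match.group()[0]))
    return hits
```

Calling `find_suspicious_bytes("100_clusters.fasta")` would list the line and offset of every byte of this kind, so the affected record can be cleaned in the database itself.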
This issue can be solved by adding the option encoding='ISO-8859-1' to the open() call in the parse_alignments function of do_alignment.py. Ideally, you should try to fix the database issue first.
Before: with open(alignment) as alignment_file:
After: with open(alignment, encoding='ISO-8859-1') as alignment_file:
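To show why the change avoids the decode failure, here is a small self-contained sketch. It assumes the alignment file is tab-separated and read with the csv module; `read_alignment_rows` is a hypothetical stand-in, not the actual parse_alignments code. ISO-8859-1 maps every possible byte to a character, so decoding cannot fail, whereas UTF-8 rejects invalid byte sequences.

```python
import csv


def read_alignment_rows(alignment, encoding="ISO-8859-1"):
    """Read a tab-separated alignment file into a list of rows.

    Hypothetical helper for illustration; the tab delimiter is an assumption.
    """
    # newline="" is what the csv module documentation recommends for csv.reader input.
    with open(alignment, encoding=encoding, newline="") as alignment_file:
        return list(csv.reader(alignment_file, delimiter="\t"))
```

Note that this only makes the parser tolerant. The odd characters are still passed through into the parsed fields, which is why cleaning the database remains the better fix.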