allind / EukDetect

Unicode error #6

Open hyphaltip opened 4 years ago

hyphaltip commented 4 years ago

I got this error when running setup:

Traceback (most recent call last):
  File "setup.py", line 5, in <module>
    long_description = fh.read()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 641: ordinal not in range(128)

To solve it, I ran export LC_ALL=en_US.UTF-8, then re-ran setup, and it worked.
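
For anyone hitting the same traceback: when open() is called without an explicit encoding, Python uses the locale's preferred encoding, which can be ASCII under a C/POSIX locale. That is why changing LC_ALL helps. A minimal sketch of a locale-independent alternative follows; the helper name read_long_description and the demo file are assumptions for illustration, not the project's actual setup.py.

```python
import locale
import os
import tempfile


def read_long_description(path):
    """Read a text file as UTF-8 regardless of the shell's locale settings."""
    # Passing encoding explicitly avoids falling back to the locale default,
    # which may be ASCII when LC_ALL/LANG are unset or set to C.
    with open(path, encoding="utf-8") as fh:
        return fh.read()


if __name__ == "__main__":
    # Show which encoding open() would use by default in this environment.
    print("default encoding:", locale.getpreferredencoding(False))

    # Demo file containing a non-ASCII character (U+00B5, UTF-8 bytes 0xc2 0xb5).
    with tempfile.NamedTemporaryFile("wb", suffix=".md", delete=False) as tmp:
        tmp.write("demo text with \u00b5\n".encode("utf-8"))
    try:
        print(read_long_description(tmp.name))
    finally:
        os.remove(tmp.name)
```

If the printed default encoding is ANSI_X3.4-1968 or US-ASCII, the locale is the likely cause of the error above.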

jayelldubya commented 1 year ago

I also received a Unicode error, but at the stage of running a subset of my actual samples (no errors arose during installation, setup, or testing as outlined). Below is the output from my terminal:

eukdetect --mode runall --configfile Metagenomics_I-Ching_Nachbac_configfile_TESTING_v1.yml 
01/24/2023 18:06:24:  Parsing config file ...
Traceback (most recent call last):
  File "/share/jwaters/anaconda3/envs/eukdetect/bin/eukdetect", line 33, in <module>
    sys.exit(load_entry_point('EukDetect==1.0.1', 'console_scripts', 'eukdetect')())
  File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/EukDetect-1.0.1-py3.6.egg/eukdetect/runall.py", line 140, in main
  File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/EukDetect-1.0.1-py3.6.egg/eukdetect/runall.py", line 454, in check_readlen
  File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 611, in parse
    for r in i:
  File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/Bio/SeqIO/QualityIO.py", line 1033, in FastqPhredIterator
    for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
  File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/site-packages/Bio/SeqIO/QualityIO.py", line 897, in FastqGeneralIterator
    line = handle_readline()
  File "/share/jwaters/anaconda3/envs/eukdetect/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 12: ordinal not in range(128)

Any advice on how to proceed would be greatly appreciated! I did try the 'export LC_ALL=en_US.UTF-8' that hyphaltip recommended, but that did not resolve my issue and I had the same error.
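
The traceback shows Biopython failing while decoding a line of the file being parsed as FASTQ, so that file contains at least one byte outside the ASCII range. One way to narrow this down is to locate that byte without decoding the file at all. The following is a minimal diagnostic sketch, not part of EukDetect; first_non_ascii is a hypothetical helper name.

```python
import sys


def first_non_ascii(path, chunk_size=1 << 20):
    """Return (offset, byte_value) of the first byte >= 0x80, or None if all ASCII."""
    offset = 0
    # Binary mode: bytes are never decoded, so no UnicodeDecodeError can occur.
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return None
            for i, value in enumerate(chunk):
                if value >= 0x80:
                    return offset + i, value
            offset += len(chunk)


if __name__ == "__main__":
    # Usage: python this_script.py <file>
    hit = first_non_ascii(sys.argv[1])
    if hit is None:
        print("file is pure ASCII")
    else:
        print("first non-ASCII byte 0x%02x at offset %d" % (hit[1], hit[0]))
```

If a byte is reported near the start of a reads file, it may be worth inspecting that region of the file before changing any locale settings.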

allind commented 1 year ago

Thanks for reaching out. I can't reproduce this error and haven't found a solution yet, though I'm still trying. It looks like the error is being thrown by Biopython in one of the check steps. Could you try running this as a snakemake pipeline directly? Full instructions are on GitHub, but in short it's snakemake --snakefile [path_to_install_folder]rules/eukdetect_eukfrac.rules --configfile [config file] --cores [cores] runall. If that works, it can serve as a workaround.

jayelldubya commented 1 year ago

Thanks so much for your prompt reply! My apologies that I missed in the documentation that using snakemake can be a good workaround if running into issues with python.

I've tried running snakemake as you suggested, but I received the following errors while it was running. It did complete, but it seems the output files are empty.

/usr/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = "en_US.UTF-8",
        LANG = "C.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("C.UTF-8").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = "en_US.UTF-8",
        LANG = "C.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("C.UTF-8").

jayelldubya commented 1 year ago

Hi Abigail,

After adding quotes when changing the locale settings (export LC_ALL="en_US.UTF-8"), that might have worked, or at least I am no longer receiving locale warnings (apologies that I did not correctly understand the error, I promise I tried to google it!).

However, now when I run the snakemake file, I receive this output to the screen and my output files again appear to be empty. Can you advise what may be going on?

Building DAG of jobs...
Nothing to be done.
Complete log: /share/jwaters/Metagenomics_I-Ching/EukDetect/.snakemake/log/2023-01-27T183421.100804.snakemake.log

allind commented 1 year ago

Hi,

Thanks for your patience with a response. This suggests that no steps of eukdetect ran the second time because snakemake didn't detect any changes (it does not notice the UTF encoding change, and the output files from the earlier run already exist). The best way to fix this is to delete the eukdetect output files, including those from the intermediate steps, and rerun the whole thing.
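
As a rough illustration of why "Nothing to be done." appears even though the outputs are empty, here is a simplified make-style freshness check. This is a sketch of the general idea only, not Snakemake's actual implementation, and needs_rerun is a hypothetical name.

```python
import os


def needs_rerun(inputs, outputs):
    """Simplified check: rerun if an output is missing or older than any input."""
    if any(not os.path.exists(out) for out in outputs):
        return True
    # Only existence and modification times are compared; file size and
    # content are never inspected, so an empty output still counts as done.
    newest_input = max((os.path.getmtime(p) for p in inputs), default=float("-inf"))
    oldest_output = min((os.path.getmtime(p) for p in outputs), default=float("inf"))
    return newest_input > oldest_output
```

Under this model, an empty output file left behind by a failed run looks up to date, and deleting it is what makes the step run again.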