a-h-b / dadasnake

Amplicon sequencing workflow heavily using DADA2 and implemented in snakemake
GNU General Public License v3.0

complaining about GB locale #25

Closed splaisan closed 2 years ago

splaisan commented 2 years ago

I finally got the snake to work, but a number of jobs are failing and reporting issues with locale variables.

One example is below. My locales are set to US:

$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

Can I do something to fix this or ignore it?

Activating conda environment: /opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49
/usr/bin/bash: line 1: warning: setlocale: LC_ALL: cannot change locale (en_GB.utf8): No such file or directory
[Tue Feb 22 14:10:58 2022]
Error in rule multiqc:
    jobid: 87
    output: stats/multiqc_filtered_report_data, stats/multiqc_filtered_report.html
    log: logs/multiqc_filtered.log (check log file(s) for error message)
    conda-env: /opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49
    shell:

        export LC_ALL=en_GB.utf8
        export LANG=en_GB.utf8
        multiqc -n stats/multiqc_filtered_report.html stats/fastqc_filtered >> logs/multiqc_filtered.log 2>&1

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Job failed, going on with independent jobs.

Digging into the log for multiqc, I found this:

cat logs/multiqc_filtered.log
Traceback (most recent call last):
  File "/opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49/bin/multiqc", line 6, in <module>
    from multiqc.__main__ import multiqc
  File "/opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49/lib/python3.6/site-packages/multiqc/__main__.py", line 53, in <module>
    multiqc.run_cli(prog_name="multiqc")
  File "/opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49/lib/python3.6/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49/lib/python3.6/site-packages/click/core.py", line 1043, in main
    _verify_python_env()
  File "/opt/biotools/dadasnake/conda/b7228a92baaacc5aaa092b362d244b49/lib/python3.6/site-packages/click/_unicodefun.py", line 100, in _verify_python_env
    raise RuntimeError("\n\n".join(extra))
RuntimeError: Click will abort further execution because Python was configured to use ASCII as encoding for the environment. Consult https://click.palletsprojects.com/unicode-support/ for mitigation steps.

This system supports the C.UTF-8 locale which is recommended. You might be able to resolve your issue by exporting the following environment variables:

    export LC_ALL=C.UTF-8
    export LANG=C.UTF-8

Click discovered that you exported a UTF-8 locale but the locale system could not pick up from it because it does not exist. The exported locale is 'en_GB.utf8' but it is not supported.

a-h-b commented 2 years ago

Hi Stéphane - thanks for your question. Have you tried the steps described at the link in the log file, https://click.palletsprojects.com/en/8.0.x/unicode-support/ ? -A

splaisan commented 2 years ago

Running a grep command finds en_GB.utf8 in many places! I am not supposed to install other locales on this server.

Our messages crossed. I will check now, thanks.

splaisan commented 2 years ago

OK, I have the C version in the locale -a output, so I guess I only need to add this at the top of dadasnake:

export LC_ALL=C.UTF-8
export LANG=C.UTF-8

right?

I already added this there:

unset R_HOME
unset R_LIBS

after complaints about my environment already having these variables defined.
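Whether a given locale is actually installed can be checked against the locale -a list before exporting it. The following is a minimal sketch, assuming bash; the helper name pick_locale and the order of candidates are illustrative and not part of dadasnake:

```shell
#!/usr/bin/env bash
# pick_locale CANDIDATE...: print the first candidate that appears in the
# list of installed locales read from stdin (normally `locale -a`).
# Hypothetical helper, not part of dadasnake.
pick_locale() {
    local installed candidate norm
    # Normalise case and hyphens so "C.UTF-8" matches the "C.utf8" spelling
    # that `locale -a` prints on many systems.
    installed=$(tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    for candidate in "$@"; do
        norm=$(printf '%s' "$candidate" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
        if printf '%s\n' "$installed" | grep -qxF -- "$norm"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer the locale the workflow asks for, then fall back to C.UTF-8.
if chosen=$(locale -a 2>/dev/null | pick_locale en_GB.utf8 C.UTF-8); then
    export LC_ALL="$chosen" LANG="$chosen"
else
    echo "neither en_GB.utf8 nor C.UTF-8 is installed" >&2
fi
```

Note that the multiqc rule shown earlier exports en_GB.utf8 inside its own shell block, so a value exported at the top of the wrapper script may be overridden when the rule runs.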

splaisan commented 2 years ago

That does not seem to fix it; I still get the error after a clean run.

a-h-b commented 2 years ago

Hi - I think I can point you to the rule to add it to. Are you using paired-end or single-end sequencing, and which kind of pooling? -A

splaisan commented 2 years ago

Thanks a lot for your help. This is PacBio HiFi data for now; I am testing your pipeline with the Park et al. mock data.

a-h-b commented 2 years ago

So, within your dadasnake folder, you can check whether it helps to change the lines

export LC_ALL=en_GB.utf8
export LANG=en_GB.utf8

in the file workflow/rules/dada.single.pool.smk to your suggestion (C.UTF-8).
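Since grep reportedly finds en_GB.utf8 in many places, the same substitution could be applied to every file under workflow/rules in one pass. This is only a sketch, assuming bash, a sed that accepts -i.bak (GNU and BSD both do), and that C.UTF-8 is the replacement you want; swap_locale is a hypothetical helper, not something dadasnake provides:

```shell
#!/usr/bin/env bash
# swap_locale DIR OLD NEW: replace OLD with NEW in every file under DIR that
# contains OLD, keeping a .bak copy of each edited file.
# Hypothetical helper, not part of dadasnake.
swap_locale() {
    local dir=$1 old=$2 new=$3 old_re file
    # Escape dots so sed matches them literally (enough for locale names).
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    grep -rlF --exclude='*.bak' -- "$old" "$dir" | while IFS= read -r file; do
        sed -i.bak "s/$old_re/$new/g" "$file"
    done
}

# Example (run from the dadasnake folder; review the changes afterwards):
#   swap_locale workflow/rules en_GB.utf8 C.UTF-8
```

The .bak copies make it easy to compare or restore the original rule files if the change does not help.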

a-h-b commented 2 years ago

That's lines 124-125 in the current version.

splaisan commented 2 years ago

Thanks a lot, Anna, I will try this.

splaisan commented 2 years ago

I finally installed the GB locale and it worked. I was afraid it would interfere with my system, but it is merely installed and does not interfere (apparently!).