FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

TrimGalore calls pigz when not required #181

Closed kaushikr3 closed 9 months ago

kaushikr3 commented 9 months ago

Hi Felix, I was trying to run TrimGalore on some Illumina data on the cluster, and I get the error below when I run this command through an sbatch script.

COMMAND:

trim_galore --paired --gzip --fastqc --output_dir "$out_dir" "${f}" "${f/_R1_001.fastq.gz/_R2_001.fastq.gz}"

ERROR:

SUMMARISING RUN PARAMETERS
Input filename: ~/Illumina_DNA_Reads/Yao_1_S240_R1_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.10
Cutadapt version: 2.10
Number of cores used for trimming: 1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using smallRNA adapter for trimming (count: 13). Second best hit was Illumina (count: 0)
Adapter sequence: 'TGGAATTCTCGG' (Illumina small RNA adapter; auto-detected)
Maximum trimming error rate: 0.1 (default)
Optional adapter 2 sequence (only used for read 2 of paired-end files): 'GATCGTCGGACT'
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 18 bp
Running FastQC on the data once trimming has completed
Output file will be GZIP compressed

This is cutadapt 2.10 with Python 3.8.12
Command line parameters: -j 1 -e 0.1 -q 20 -O 1 -a TGGAATTCTCGG ~/Illumina_DNA_Reads/Yao_1_S240_R1_001.fastq.gz
Run "cutadapt --help" to see command-line options.
See https://cutadapt.readthedocs.io/ for full documentation.

cutadapt: error: [Errno 13] Permission denied: 'pigz'

It fails even though I haven't requested multi-core processing. Another important thing to note: I ran the same code a couple of months ago and everything worked perfectly.

Things I have tried so far: I downloaded pigz and added it to my PATH, but that didn't solve the issue, irrespective of whether I added it to the local path or to my bashrc.

Please let me know what I can do to solve this!

NOTE: I can't update the Python or Cutadapt version on the cluster because I don't have admin access.

Kaushik

FelixKrueger commented 9 months ago

Hi @kaushikr3

Hmm, I think this probably isn't really a Trim Galore issue as such; it is most likely an infrastructure/software permission issue that would be best solved by your sysadmin team. We can still try some troubleshooting here.

Using Conda/Mamba

Probably the easiest way to get everything working would be to set up a conda environment (or rather a mamba one, for speed reasons). You just need to install mamba, add the bioconda channel, and then install Trim Galore into a fresh environment:

Install mamba:

curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh

Add bioconda channel:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict

Create a new environment:

mamba create --name trimit
conda activate trimit

Install Trim Galore:

mamba install trim-galore

This should take a few minutes, but after that everything should work.
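To confirm that the tools now resolve from inside the new environment rather than from a system-wide location, a small report like the following can help. This is only a sketch: `where_from` is a hypothetical helper name, not something Trim Galore or conda provides, and the list of tools to check is up to you.

```shell
# Hypothetical helper: print each tool name next to the path the shell
# resolves it to, or MISSING if the shell cannot find it at all.
where_from() {
    local tool
    for tool in "$@"; do
        # command -v prints the resolved path (or nothing if not found)
        printf '%s\t%s\n' "$tool" "$(command -v "$tool" 2>/dev/null || echo MISSING)"
    done
}

# Example call after `conda activate trimit` (tool list is an assumption):
# where_from trim_galore cutadapt pigz
```

If the environment is active, the paths printed for these tools would be expected to point somewhere inside the environment's own directory.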

Manual way

Alternatively, you can try to identify what's working and what isn't.

To test whether pigz is working, when you type it as a command do you see something like this?

$ pigz
Usage: pigz [options] [files ...]
  will compress files in place, adding the suffix '.gz'. If no files are
  specified, stdin will be compressed to stdout. pigz does what gzip does,
  but spreads the work over multiple processors and cores when compressing.

I just checked the code inside Trim Galore, and it appears that it only checks for pigz if you selected multi-core trimming, but I cannot see that option in the command line above. Did you specify -j at all? Another option would be to go for single-core processing, in which case the default should be gzip. Again, the bioconda installation should install pigz as well as igzip for faster trimming, so it would probably be the best option.
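Since the error is "Permission denied" rather than "not found", it may also be worth checking whether a pigz file exists on the PATH but lacks execute permission for your user. The sketch below walks the PATH entries by hand; `check_tool` is a hypothetical helper, not part of Trim Galore or Cutadapt, and its return codes are an arbitrary convention chosen for this example.

```shell
# Hypothetical helper: look for a tool in every PATH directory and report
# whether it is executable by the current user.
# Returns 0 if an executable copy is found, 2 if only non-executable
# copies are found, and 1 if the tool is not on PATH at all.
check_tool() {
    local tool="$1" dir candidate blocked=""
    local IFS=':'
    for dir in $PATH; do
        # An empty PATH entry means the current directory
        candidate="${dir:-.}/$tool"
        [ -f "$candidate" ] || continue
        if [ -x "$candidate" ]; then
            echo "ok: $candidate"
            return 0
        fi
        blocked="$candidate"
        echo "found but not executable: $candidate"
    done
    if [ -n "$blocked" ]; then
        return 2
    fi
    echo "not found on PATH: $tool"
    return 1
}

# Example call:
# check_tool pigz
```

A "found but not executable" line would point towards a file permission problem (something `ls -l` on that path can confirm) rather than a problem in Trim Galore itself.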

I just saw that your installation of Cutadapt (v2.10) is 3.5 years out of date. There is a good chance that updating Cutadapt will just fix it all (see again the Mamba option).
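To see which Cutadapt version a given shell or job script actually picks up, `cutadapt --version` prints it. A rough way to compare that against a version of your choosing is sketched below. `version_lt` is a hypothetical helper that assumes a `sort` with the `-V` (version sort) option, as in GNU coreutils, and `MIN_VERSION` is a placeholder: this thread does not state which Cutadapt release resolves the error.

```shell
# Hypothetical helper: succeed if version $1 is strictly older than $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage sketch (MIN_VERSION is a placeholder you set yourself):
# installed="$(cutadapt --version)"
# if version_lt "$installed" "$MIN_VERSION"; then
#     echo "Cutadapt $installed is older than $MIN_VERSION"
# fi
```

Version sort matters here because a plain string comparison would rank 2.10 below 2.9.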

Hope this helps!

kaushikr3 commented 9 months ago

Yup! Updating Cutadapt worked! I am closing this issue. Thank you!