FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

how to set the input files with absolute path #97

Closed WellJoea closed 4 years ago

WellJoea commented 4 years ago

When I give the input files as absolute paths, I get an error. I think it is better to use absolute paths for the input files when the shell's working directory is not the FASTQ folder.

FelixKrueger commented 4 years ago

Hi @WellJoea

It should be possible to give the absolute path to the input files, like so:

trim_galore /path/to/input_fastq_files/file.fastq.gz

What exactly is the problem in your case?

WellJoea commented 4 years ago

When I use --polyA, I get an error like this:

POLY-A TRIMMING MODE; EXPERIMENTAL!!

Now performing Poly-A trimming for the adapter sequence: 'AAGCAGTGGTATCAACGCAGAG' from file /data/zhouwei/02production/20200721_1717/GC01-R1/GC01-R1_Q801602_L003_R1_001.fastq.gz <<<
gzip: /data/zhouwei/02production/20200721_1717/GC01-R1/TrimGalore//data/zhouwei/02production/20200721_1717/GC01-R1/GC01-R1_Q801602_L003_R1_001.fastq.gz: No such file or directory
This is cutadapt 2.10 with Python 3.8.3
Command line parameters: -j 7 -e 0.1 -O 1 -a AAGCAGTGGTATCAACGCAGAG /data/zhouwei/02production/20200721_1717/GC01-R1/TrimGalore//data/zhouwei/02production/20200721_1717/GC01-R1/GC01-R1_Q801602_L003_R1_001.fastq.gz

The command line:

trim_galore \
    --fastqc \
    --fastqc_args "-t 20 --java $JAVA --outdir $OU " \
    -a AAGCAGTGGTATCAACGCAGAG \
    -a2 AAGCAGTGGTATCAACGCAGAG \
    -q 20 \
    --length 20 \
    -j 7 \
    --trim-n \
    --path_to_cutadapt $cutadapt \
    --polyA \
    --paired \
    --retain_unpaired \
    -r1 35 \
    -r2 35 \
    --phred33 \
    --basename $ID \
    -o $OU \
    $IN/GC01-R1_Q801602L003*_001.fastq.gz
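The gzip line in the log above shows what looks like an output directory (ending in TrimGalore/) placed in front of an input path that was already absolute. A minimal sketch of how that kind of doubled path can arise, and how a check for a leading / avoids it, is given here. This is an illustration only and not Trim Galore's actual code; naive_join, safe_join and the example paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the doubled path seen in the log.
# They are NOT taken from Trim Galore.

# Blindly prefixes a directory, even when the file path is already absolute.
naive_join() {
    printf '%s/%s\n' "$1" "$2"
}

# Leaves absolute paths untouched and only prefixes relative ones.
safe_join() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;
        *)  printf '%s/%s\n' "${1%/}" "$2" ;;
    esac
}

# Example paths are made up for the demonstration.
naive_join "/out/TrimGalore" "/data/sample_R1.fastq.gz"
safe_join  "/out/TrimGalore" "/data/sample_R1.fastq.gz"
```

The first call prints a path with the directory glued in front of the absolute input path, which has the same shape as the one gzip could not find; the second call returns the input path unchanged.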

FelixKrueger commented 4 years ago

That error is in fact a warning message only. The option --polyA is supposed to be used only for a special kit by ThermoFisher (Collibri PolyA kit, https://www.thermofisher.com/order/catalog/product/A38110024#/A38110024). Is that what you have?

--polyA                 This is a new, still experimental, trimming mode to identify and remove poly-A tails from sequences.
                        When --polyA is selected, Trim Galore attempts to identify from the first supplied sample whether
                        sequences contain more often a stretch of either 'AAAAAAAAAA' or 'TTTTTTTTTT'. This determines
                        if Read 1 of a paired-end end file, or single-end files, are trimmed for PolyA or PolyT. In case of
                        paired-end sequencing, Read2 is trimmed for the complementary base from the start of the reads. The
                        auto-detection uses a default of A{20} for Read1 (3'-end trimming) and T{150} for Read2 (5'-end trimming).
                        These values may be changed manually using the options -a and -a2.

                        In addition to trimming the sequences, white spaces are replaced with _ and it records in the read ID
                        how many bases were trimmed so it can later be used to identify PolyA trimmed sequences. This is currently done
                        by writing tags to both the start ("32:A:") and end ("_PolyA:32") of the reads in the following example:

                        @READ-ID:1:1102:22039:36996 1:N:0:CCTAATCC
                        GCCTAAGGAAACAAGTACACTCCACACATGCATAAAGGAAATCAAATGTTATTTTTAAGAAAATGGAAAATAAAAACTTTATAAACACCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

                        @32:A:READ-ID:1:1102:22039:36996_1:N:0:CCTAATCC_PolyA:32
                        GCCTAAGGAAACAAGTACACTCCACACATGCATAAAGGAAATCAAATGTTATTTTTAAGAAAATGGAAAATAAAAACTTTATAAACACC

                        PLEASE NOTE: The poly-A trimming mode expects that sequences were both adapter and quality trimmed
                        before looking for Poly-A tails, and it is the user's responsibility to carry out an initial round of
                        trimming. The following sequence:

                        1) trim_galore file.fastq.gz
                        2) trim_galore --polyA file_trimmed.fq.gz
                        3) zcat file_trimmed_trimmed.fq.gz | grep -A 3 PolyA | grep -v ^-- > PolyA_trimmed.fastq

                        Will 1) trim qualities and Illumina adapter contamination, 2) find and remove PolyA contamination.
                        Finally, if desired, 3) will specifically find PolyA trimmed sequences to a specific FastQ file of your choice.
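The read-ID tags described in the help text can also be used from a small script. The sketch below assumes the tag format is exactly as in the example header above (a _PolyA:<n> suffix at the end of the header line) and that the FASTQ file uses strict 4-line records; polya_len and polya_records are hypothetical helper names, not part of Trim Galore.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the tag format is assumed from the help text example.

# Prints the number of trimmed bases recorded in a "_PolyA:<n>" suffix,
# or nothing if the header carries no such tag.
polya_len() {
    printf '%s\n' "$1" | sed -n 's/.*_PolyA:\([0-9][0-9]*\)$/\1/p'
}

# Reads FASTQ on stdin and prints only the complete 4-line records whose
# header line ends in "_PolyA:<n>". Assumes strict 4-line records.
polya_records() {
    awk 'NR % 4 == 1 { keep = ($0 ~ /_PolyA:[0-9]+$/) } keep'
}
```

A possible use is zcat file_trimmed_trimmed.fq.gz | polya_records > PolyA_trimmed.fastq. Unlike grep -A 3 PolyA, this only tests header lines, so a quality string that happens to contain the characters PolyA would not pull in unrelated lines.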

Just generally, you seem to be setting every single parameter that Trim Galore has... Under normal settings, a command like this:

trim_galore --paired *fastq.gz

would do pretty much everything you need. If you have an adapter that is different to the Illumina or Nextera primers but you want to use the same adapter for both R1 and R2, it is sufficient to specify the sequence a single time with -a AAGCAGTGGTATCAACGCAGAG.
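Putting the two suggestions together, the long command from earlier in the thread could probably shrink to something like the sketch below. It uses only options that already appear in this thread (--paired, -a, -o) and leaves out --polyA, which is meant for the special kit mentioned above. The helper name build_trim_cmd and the example paths are hypothetical, and the function only prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints a reduced trim_galore call for one read pair.
# $1 = output directory, $2 = R1 file, $3 = R2 file (absolute paths are fine).
# Paths containing spaces would need extra quoting before the output is executed.
build_trim_cmd() {
    printf 'trim_galore --paired -a %s -o %s %s %s\n' \
        "AAGCAGTGGTATCAACGCAGAG" "$1" "$2" "$3"
}

# Example paths are made up for the demonstration.
build_trim_cmd /path/to/out /path/to/in/sample_R1.fastq.gz /path/to/in/sample_R2.fastq.gz
```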

WellJoea commented 4 years ago

I got it and solved the problem. Thank you very much!!!

FelixKrueger commented 4 years ago

Excellent, good luck!