Xinglab / rmats-turbo

adaptor trimming #324

Open rezarahman12 opened 9 months ago

rezarahman12 commented 9 months ago

Dear rMATS team,

I ran rMATS-turbo on FASTQ files to detect differential alternative splicing. The FASTQ files contained the standard Illumina adaptors for paired-end sequencing, and I did not trim the adaptors before passing the files to rMATS. Is that acceptable, or is it a problem for the detection of differential alternative splicing by rMATS?

I appreciate your time and help.

Kind regards,
Reza

EricKutschera commented 9 months ago

If rMATS is run with FASTQ input files and without --allow-clipping, then STAR will be run with --alignEndsType EndToEnd: https://github.com/Xinglab/rmats-turbo/blob/v4.1.2/rmats.py#L67

In that case STAR will not be allowed to use clipping in the alignments, and the resulting alignments will probably not work well with rMATS.

If rMATS is run with FASTQ input files and --allow-clipping, then the adapter sequence may be clipped in the alignment, and I would expect rMATS to be able to use the alignments.
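The flag behavior described above can be sketched roughly as follows. This is only an illustration of the logic as described in this reply, not the actual code in rmats.py; the helper name `star_end_type_args` is hypothetical.

```python
from typing import List


def star_end_type_args(allow_clipping: bool) -> List[str]:
    """Hypothetical helper: extra STAR arguments implied by --allow-clipping.

    Assumption (taken from the reply above): without --allow-clipping,
    STAR is run with --alignEndsType EndToEnd, so reads cannot be clipped.
    With --allow-clipping, that option is left off so STAR may clip
    read ends (for example, untrimmed adapter sequence).
    """
    args: List[str] = []
    if not allow_clipping:
        args += ["--alignEndsType", "EndToEnd"]
    return args


if __name__ == "__main__":
    print("without --allow-clipping:", star_end_type_args(False))
    print("with --allow-clipping:   ", star_end_type_args(True))
```

With untrimmed adapters, the end-to-end mode is the case to watch, because a read that runs into adapter sequence has to align over its full length or not at all.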

rezarahman12 commented 9 months ago

Thanks for your quick reply. I did not use --allow-clipping; however, rMATS still produced results. Is that okay?

EricKutschera commented 9 months ago

If rMATS was able to produce output, then I think it's ok to use that output. There should have been a section in the printed output showing the number of reads used or filtered, and that should also be written to a file in the --tmp directory with a name like [datetime]_read_outcomes_by_bam.txt. It might be good just to check that most of the reads are shown as "USED".

rezarahman12 commented 9 months ago

Thanks again. I took a look and found that the majority (~88%) of the total reads from each BAM had been used by rMATS. Please see the details for one of them below:

/out/2023-07-29-15_39_26_063734_bam1_1/Aligned.sortedByCoord.out.bam
USED: 128142543
NOT_PAIRED: 0
NOT_NH_1: 13309072
NOT_EXPECTED_CIGAR: 2239234
NOT_EXPECTED_READ_LENGTH: 0
NOT_EXPECTED_STRAND: 0
EXON_NOT_MATCHED_TO_ANNOTATION: 2546201
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 198728
CLIPPED: 0
TOTAL_FOR_BAM: 146435778
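For anyone who wants to run the same check, here is a minimal sketch that computes the share of reads per outcome from the counts posted above. It assumes the outcomes appear as `NAME: count` pairs, as in the pasted text; the exact layout of the read_outcomes_by_bam.txt file may differ, and the function name `parse_read_outcomes` is made up for this example.

```python
import re

# Counts copied from the post above (one BAM file).
REPORT = """
USED: 128142543 NOT_PAIRED: 0 NOT_NH_1: 13309072 NOT_EXPECTED_CIGAR: 2239234
NOT_EXPECTED_READ_LENGTH: 0 NOT_EXPECTED_STRAND: 0
EXON_NOT_MATCHED_TO_ANNOTATION: 2546201
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 198728 CLIPPED: 0
TOTAL_FOR_BAM: 146435778
"""


def parse_read_outcomes(text):
    """Hypothetical parser: collect 'NAME: count' pairs into a dict."""
    return {name: int(count)
            for name, count in re.findall(r"([A-Z0-9_]+):\s*(\d+)", text)}


counts = parse_read_outcomes(REPORT)
total = counts["TOTAL_FOR_BAM"]

# The individual outcome categories should add up to the reported total.
category_sum = sum(v for k, v in counts.items() if k != "TOTAL_FOR_BAM")
print("categories sum to total:", category_sum == total)

# Percentage of reads per outcome, largest first, skipping empty categories.
for name, count in sorted(counts.items(), key=lambda kv: -kv[1]):
    if name != "TOTAL_FOR_BAM" and count > 0:
        print(f"{name}: {100 * count / total:.1f}%")
```

For these counts, USED comes out at about 87.5% of the total, and the largest filtered category is NOT_NH_1 at about 9.1%.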