mhammell-laboratory / TElocal

A package for quantifying transposable elements at a locus level for RNAseq datasets.
GNU General Public License v3.0

Some elements have more reads in unique mode vs multi mode #44

Open R-Najjar opened 2 months ago

R-Najjar commented 2 months ago

Hi, I ran TElocal in multi and uniq modes to get an idea of how much read support came from unique versus multimapping reads, and I found some elements with more reads in uniq mode than in multi mode. How is this possible? In other words, why weren't these unique reads counted in multi mode, which should count all reads? This happened in 47 out of 657 elements that I was studying across 70 samples. The differences ranged from 1 to 4 reads.
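For anyone who wants to run the same comparison, here is a minimal Python sketch of it. It assumes each run wrote a tab-delimited count table with one header line, the element name in the first column and the count in the second; the file names in the usage line and the helper names `parse_counts` and `uniq_exceeds_multi` are hypothetical, so adjust them to your own outputs.

```python
import sys


def parse_counts(lines):
    """Parse a tab-delimited count table into {element_name: count}.

    Assumption: the first line is a header and each later line is
    '<element>\t<count>'. Lines with fewer than two columns are skipped.
    """
    counts = {}
    rows = iter(lines)
    next(rows, None)  # skip the assumed header line
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        counts[fields[0]] = float(fields[1])
    return counts


def uniq_exceeds_multi(uniq, multi):
    """Return sorted (element, uniq_count, multi_count) where uniq > multi.

    Elements missing from the multi table are treated as having count 0.
    """
    return sorted(
        (name, u, multi.get(name, 0.0))
        for name, u in uniq.items()
        if u > multi.get(name, 0.0)
    )


if __name__ == "__main__":
    # Hypothetical usage: python compare_counts.py uniq.cntTable multi.cntTable
    with open(sys.argv[1]) as uniq_file, open(sys.argv[2]) as multi_file:
        uniq_counts = parse_counts(uniq_file)
        multi_counts = parse_counts(multi_file)
    for name, u, m in uniq_exceeds_multi(uniq_counts, multi_counts):
        print(f"{name}\t{u:g}\t{m:g}\t{u - m:g}")
```

Each printed row gives the element, its uniq count, its multi count, and the difference.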

Thanks, Rayan

olivertam commented 2 months ago

Hi,

Could you provide the command lines that you used for TElocal? Can you also describe how you aligned the reads?

Thanks.

R-Najjar commented 2 months ago

Hi Oliver, thank you. I used STAR for alignment:

STAR --runThreadN 24 --genomeDir /t2t/nw/star --runMode alignReads --readFilesIn ${SAMP}.1.fastq.gz ${SAMP}.2.fastq.gz --readFilesCommand zcat --outFileNamePrefix /t2t/nw/bams/${SAMP} --outFilterMultimapNmax 100 --winAnchorMultimapNmax 100 --twopassMode Basic --outSAMtype BAM Unsorted
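As a sanity check on the alignment itself, the following Python sketch tallies how many distinct read names in the BAM are uniquely aligned and how many are multi-mapped. It relies on the `NH` attribute that STAR writes by default (the number of loci a read maps to) and reads SAM text on standard input, for example from `samtools view`. The function name `tally_by_nh` is hypothetical, and this is only an illustration of the unique/multi split in the alignments, not a description of how TElocal classifies reads internally.

```python
import sys


def tally_by_nh(sam_lines):
    """Count distinct read names with NH == 1 (unique) and NH > 1 (multi).

    Header lines (starting with '@'), records without an NH tag, and
    records with NH < 1 (e.g. unmapped reads) are ignored.
    """
    unique_names, multi_names = set(), set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        nh = None
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        if nh is None or nh < 1:
            continue
        if nh == 1:
            unique_names.add(fields[0])
        else:
            multi_names.add(fields[0])
    return len(unique_names), len(multi_names)


if __name__ == "__main__":
    # Example: samtools view ${SAMP}Aligned.out.bam | python tally_nh.py
    n_unique, n_multi = tally_by_nh(sys.stdin)
    print(f"unique\t{n_unique}")
    print(f"multi\t{n_multi}")
```

Counting read names rather than lines avoids counting both mates of a pair, or every alignment of a multi-mapper, as separate reads.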

And here are the two TElocal runs:

apptainer exec --bind /gscratch telocal.sif TElocal --mode uniq --stranded reverse --project /t2t/nw/sines/local_uniq_${SAMP} -b ${SAMP}Aligned.out.bam --GTF /t2t/chm13v2.0_RefSeq.gtf --TE /t2t/T2T-CHM13v2_rmsk_TE.gtf.locInd

apptainer exec --bind /gscratch telocal.sif TElocal --mode multi --stranded reverse --project /t2t/nw/sines/local_multi_${SAMP} -b ${SAMP}Aligned.out.bam --GTF /t2t/chm13v2.0_RefSeq.gtf --TE /t2t/T2T-CHM13v2_rmsk_TE.gtf.locInd

olivertam commented 2 months ago

Hi,

I was able to reproduce your issue. We're taking a closer look at it.

Thanks.

R-Najjar commented 1 month ago

I appreciate it. Please let me know what you find. I checked whether the same happens in TEtranscripts, and it does, although it is not common. I found 27 occurrences in 21 elements, with differences of 1-9 reads between unique and multi counts. Attached: uniq.more.than.multi.csv

Thanks

R-Najjar commented 1 month ago

Hi Oliver, do you have an update on this? I'd appreciate it. Thank you.

olivertam commented 1 month ago

Hi,

We are still pinpointing the possible source.

Thanks.