williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

confusion with input type #148

Closed: me37uday closed this issue 3 years ago

me37uday commented 3 years ago

Hi,

What exactly does the warning below mean? It seems to contradict itself: sorted by name, but unsorted? Sorry, I didn't really get it.

#WARNING: for paired-end BAM input, it MUST be sorted by name (a.k.a unsorted).

dg520 commented 3 years ago

@me37uday No, it is not contradictory. In STAR, "unsorted" means the output file is NOT sorted by coordinate. STAR writes the alignments in the same order as the FASTQ input, which is effectively sorted by name.

me37uday commented 3 years ago

Thanks for your response.

Do you mean that all BAM files generated by STAR alignment are already sorted by name?

dg520 commented 3 years ago

@me37uday If you haven't explicitly turned on the sort option, the answer is yes.

Generally speaking, "sorted" refers to coordinate sorting. "Unsorted", on the other hand, refers to name sorting that follows the FASTQ order (so the two reads of a pair appear together in the output if the library is paired-end).
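
To make the "pairs come together" point concrete, here is a minimal sketch that checks whether records sharing a read name sit next to each other in SAM text. The function name `reads_grouped_by_name` and the toy records are assumptions made for illustration; this is not how IRFinder validates its input.

```python
from itertools import groupby

def reads_grouped_by_name(sam_lines):
    """Return True if all records that share a read name are consecutive."""
    # Skip blank lines and header lines (they start with "@"); keep only QNAME.
    names = (
        line.split("\t", 1)[0]
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    )
    seen = set()
    for name, _run in groupby(names):
        # A name that starts a second run was interrupted by other reads.
        if name in seen:
            return False
        seen.add(name)
    return True

# Toy records (QNAME, FLAG, RNAME, POS only), hypothetical data.
fastq_order = [
    "r1\t99\tchr1\t100", "r1\t147\tchr1\t300",
    "r2\t99\tchr1\t150", "r2\t147\tchr1\t400",
]
coordinate_order = sorted(fastq_order, key=lambda rec: int(rec.split("\t")[3]))

print(reads_grouped_by_name(fastq_order))
print(reads_grouped_by_name(coordinate_order))
```

In the coordinate-ordered list the mates of `r1` are separated by a read from `r2`, which is the situation the warning is about.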

me37uday commented 3 years ago

Thanks for the detailed explanation @dg520 :)

Personally, I don't always sort during alignment, to avoid the longer runtime and higher resource requirements. Hence the confusion. However, I did try to identify retained introns, based on certain filters, using both types of BAM files (sorted by coordinate and sorted by name). I was able to call more retained introns using the latter.

Once again, thanks for this amazing tool :)

Cheers, Uday

dg520 commented 3 years ago

@me37uday Just a heads-up: it's not about how many IR events are detected or how strong each IR signal is. The quantification result on a coordinate-sorted paired-end BAM file is simply WRONG. For a paired-end library, you MUST use an unsorted BAM.
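
One quick sanity check before running a paired-end sample is to look at the `SO:` tag on the `@HD` header line, which the SAM format uses to record the sort order (`unknown`, `unsorted`, `queryname`, or `coordinate`). The sketch below assumes the header has already been dumped to text, for example with `samtools view -H`; the function names are hypothetical, and a missing or `unsorted` tag does not by itself prove that mates are adjacent.

```python
def sort_order(header_lines):
    """Return the SO: value from the @HD line, or "unknown" if it is absent."""
    for line in header_lines:
        if line.startswith("@HD"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"

def looks_coordinate_sorted(header_lines):
    """True when the header declares coordinate sorting (a red flag for paired-end input here)."""
    return sort_order(header_lines) == "coordinate"

# Hypothetical header of a coordinate-sorted BAM.
header = ["@HD\tVN:1.6\tSO:coordinate\n", "@SQ\tSN:chr1\tLN:248956422\n"]
if looks_coordinate_sorted(header):
    print("Header says coordinate-sorted: do not use this BAM for paired-end input.")
```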

me37uday commented 3 years ago

@dg520 I used the --unsorted parameter during the STAR alignment. You reckon I shouldn't sort the resulting bam file based on name before using IRFinder?

dg520 commented 3 years ago

@me37uday I'm not aware that STAR has a parameter named exactly --unsorted. But you can keep both the sorted and the unsorted BAM files. To use IRFinder, you have to provide the unsorted BAM. You can use the sorted one for other purposes.

dg520 commented 3 years ago

@me37uday It's OK to do an extra name-sorting step after you get the STAR alignment, just to be sure, although it's not necessary.
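
For reference, STAR selects its output order with `--outSAMtype` (`BAM Unsorted` or `BAM SortedByCoordinate`), and the optional name-sorting step is commonly done with `samtools sort -n`. Below is a minimal sketch that only builds that command; it assumes `samtools` is installed, and the file names and the helper `name_sort_command` are placeholders rather than anything IRFinder requires.

```python
import shlex

def name_sort_command(in_bam, out_bam, threads=1):
    """Build (but do not run) a samtools command that sorts a BAM by read name."""
    # -n sorts by read name, -@ sets extra threads, -o sets the output file.
    return ["samtools", "sort", "-n", "-@", str(threads), "-o", out_bam, in_bam]

# Placeholder file names; adjust to your own paths.
cmd = name_sort_command("Aligned.out.bam", "Aligned.namesorted.bam", threads=4)
print(shlex.join(cmd))

# To actually run it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```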

me37uday commented 3 years ago

Alright, good to know. Thanks @dg520 :)