
hyunhwan-jeong / SalmonTE

SalmonTE is an Ultra-Fast and Scalable Quantification Pipeline of Transposable Element (TE) Abundances

GNU General Public License v3.0


small RNA-seq #6

Closed adomingues closed 5 years ago

adomingues commented 6 years ago

I did not find this in the manuscript, but would it make sense to use SalmonTE to quantify the expression of small RNAs? Or is there a reason not to do it?

My use case is unusual, since it would be quantification of which transposons are being targeted by piRNAs, so not a traditional gene/transcript expression quantification.

I am willing to test it; I just wanted to know if there are any arguments against it before even starting.

Thank you.

hyunhwan-jeong commented 6 years ago

Hi @adomingues,

Could you let me know how small those RNAs are? If they are longer than 30 bp, you can use SalmonTE as is. Otherwise, I think we would need to reduce the k-mer length, which should help.
To make sure, are your reference sequences piRNAs?

Thank you,

Hwan

adomingues commented 6 years ago

Hi @hyunhwaj,

> Could you let me know how small those RNAs are?

In the range of 26-29 bp. I will adjust the k-mer size accordingly.
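
As a rough sketch of what adjusting the k-mer size could look like for reads this short: the helper below picks the largest odd k that still fits inside the shortest read, capped at Salmon's default of 31, and assembles a plain `salmon index` command with the `-k` option. The function names and file paths are hypothetical, and whether SalmonTE itself exposes the k-mer size is an assumption not confirmed in this thread; the index here is built with Salmon directly. The value returned is only an upper bound; a smaller k may well be more sensitive.

```python
def max_odd_kmer_size(min_read_len, default_k=31):
    """Largest odd k that does not exceed the shortest read, capped at default_k."""
    k = min(min_read_len, default_k)
    if k % 2 == 0:
        k -= 1  # assumption: Salmon expects an odd k-mer size
    return k


def salmon_index_command(reference_fasta, index_dir, k):
    """Build (but do not run) a `salmon index` command with a custom k."""
    return ["salmon", "index", "-t", reference_fasta, "-i", index_dir, "-k", str(k)]


# Hypothetical paths; piRNA reads of 26-29 nt, so the shortest read is 26.
k = max_odd_kmer_size(26)
cmd = salmon_index_command("te_reference.fa", "te_index_k%d" % k, k)
print(" ".join(cmd))
```

The command is only printed, not executed, so the sketch can be tried without Salmon installed.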

> To make sure, are your reference sequences piRNAs?

(sorry if this is getting too heavy/focused on the biogenesis of piRNAs)

The reference sequences will be the repeat annotations you provided earlier. Whilst primary piRNAs are derived from genomic clusters, these then get amplified in the ping-pong pathway through cleavage of single-stranded transcripts, mostly transcribed repeat elements (panel b in the figure below). Since (i) defining those primary piRNA clusters is not straightforward, and (ii) my goal is to find which transposons are being targeted, using the transposon reference sequences seems like the way to go.

Anyway, I will test it and see how it goes.

[Figure: f2 large ref]

hyunhwan-jeong commented 6 years ago

@adomingues,

It has been a while since you replied to my questions. I am wondering whether SalmonTE worked for your case. I also have a follow-up question. As I understand it, your sequencing data contain sequences of transposable elements; is that correct? Does your data also contain piRNA sequences?

Many Thanks!

Hyun-Hwan Jeong

adomingues commented 6 years ago

Hi @hyunhwaj, to be honest, since the approach is untested for small RNA-seq and I had tons of projects coming my way (still do), I ended up not playing with it and stuck with the tool I was previously using for this type of analysis.

> As I understand it, your sequencing data contain sequences of transposable elements; is that correct? Does your data also contain piRNA sequences?

Yes to both. piRNAs are just small RNA sequences that target transposons with roughly 100% complementarity. They can be sense or antisense to the transposons. Think of them as small pieces of fragmented transposon sequences. One could just map small RNA-seq data to the genome and then intersect the read mapping locations with the annotated transposon locations to get a (very) rough idea of which transposons are being targeted. This approach of course ignores all the multimapping issues that come along with transposon-related analysis :)
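
To make the map-then-intersect idea concrete, here is a minimal sketch that counts mapped reads overlapping annotated transposons and splits them into sense and antisense. The coordinates, element names, and function name are made up for illustration, intervals are assumed to be 0-based and half-open, and each read is assumed to have a single mapping location, so the multimapping problem is ignored here as well.

```python
from collections import Counter


def count_reads_per_te(reads, te_annotations):
    """Count reads overlapping each transposon, split by orientation.

    reads: iterable of (chrom, start, end, strand)
    te_annotations: iterable of (chrom, start, end, strand, name)
    Intervals are 0-based, half-open. Naive O(reads x annotations) scan.
    """
    counts = Counter()
    for r_chrom, r_start, r_end, r_strand in reads:
        for t_chrom, t_start, t_end, t_strand, name in te_annotations:
            overlaps = r_chrom == t_chrom and r_start < t_end and t_start < r_end
            if overlaps:
                orientation = "sense" if r_strand == t_strand else "antisense"
                counts[(name, orientation)] += 1
    return counts


# Hypothetical toy data: four mapped small RNA reads and two annotated elements.
reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 110, 138, "-"),
    ("chr1", 500, 526, "-"),
    ("chr2", 100, 127, "+"),  # no annotated transposon on chr2
]
te_annotations = [
    ("chr1", 90, 400, "+", "TE_A"),
    ("chr1", 450, 900, "-", "TE_B"),
]

counts = count_reads_per_te(reads, te_annotations)
for (name, orientation), n in sorted(counts.items()):
    print(name, orientation, n)
```

For real data sets an interval index or a dedicated tool would replace the nested loop; the sketch only shows the logic of the rough approach.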

weedcentipede commented 4 years ago

Hello there, I was wondering about this exact same issue. Has anyone taken a look?

Cheers, Luis Alfonso.