Open beginner984 opened 2 years ago
@beginner984 Thank you for your question!
Searching for lncRNAs in RNAcentral is indeed not straightforward. My colleague @blakesweeney might be able to provide more specific advice, but in general I would treat lncRNA and lincRNA as the same class to be on the safe side, as some lncRNAs could be incorrectly classified as lincRNAs and vice versa. I would also suggest not using Rfam sequences if you are interested in lncRNAs, as Rfam does not focus on lncRNAs.
With respect to your question about why you observe such a high percentage of lncRNAs in your sample, that's difficult to answer without more information, and the RNAcentral team cannot provide input on specific research projects. I would suggest spot-checking some of these lncRNA entries to see whether you notice any pattern. It could be a misannotation, meaning those sequences are not actually lncRNAs, or it could be that your short sequences happen to overlap these lncRNAs by chance.
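To make the two suggestions above concrete, here is a minimal sketch in plain Python. It assumes a read count table that can be reduced to `(gene_name, biotype, read_count)` rows, and it assumes GENCODE/Ensembl-style biotype labels; the label set `LNCRNA_BIOTYPES`, the function names, and the toy rows are all hypothetical and should be adapted to your own annotation.

```python
from collections import defaultdict

# Assumption: biotype labels follow GENCODE/Ensembl-style naming.
# Adjust this set to whatever labels your annotation actually uses.
LNCRNA_BIOTYPES = {"lncRNA", "lincRNA", "antisense",
                   "sense_intronic", "sense_overlapping"}


def collapse_biotype(biotype):
    """Map every lncRNA-like label to a single 'lncRNA' class."""
    return "lncRNA" if biotype in LNCRNA_BIOTYPES else biotype


def percent_by_class(rows):
    """Return {class: percent of total reads} after collapsing biotypes.

    rows is a list of (gene_name, biotype, read_count) tuples.
    """
    totals = defaultdict(int)
    for _gene, biotype, count in rows:
        totals[collapse_biotype(biotype)] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cls: 100.0 * n / grand_total for cls, n in totals.items()}


def top_entries(rows, cls="lncRNA", n=5):
    """Return the n highest-count rows of one class, for manual spot-checking."""
    selected = [row for row in rows if collapse_biotype(row[1]) == cls]
    return sorted(selected, key=lambda row: row[2], reverse=True)[:n]


# Toy rows for illustration only (not real data).
example_rows = [
    ("geneA", "protein_coding", 50),
    ("geneB", "lincRNA", 20),
    ("geneC", "lncRNA", 10),
    ("geneD", "miRNA", 15),
    ("geneE", "antisense", 5),
]

percentages = percent_by_class(example_rows)
spot_check = top_entries(example_rows, n=2)
print(percentages)
print(spot_check)
```

If a handful of entries returned by `top_entries` account for most of the lncRNA reads, those are the natural ones to look up by hand first.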
I hope this helps!
Indeed, this is not an issue with RNAcentral; rather, I need help with some intuition, please.
We have exosome sequencing data (from plasma). In the raw read count file, I see 72650 gene names.
This is what my read count file looks like
I have created a percentage bar chart of the RNA categories annotated in this exosome-seq data
Which category (RNA type) should I consider as long non-coding RNA (lncRNA)?
Can I consider the observed 24% long intergenic non-coding RNA (lincRNA) (sense + antisense) as long non-coding RNA (lncRNA)?
But I have read the following: "Generally speaking we don’t expect much lncRNA/mRNA in plasma and much of that will be heavily fragmented which makes it very difficult to sequence." So why do I see 24% lincRNAs?
If this were your data, which types of RNA here would you consider as long non-coding RNA (lncRNA)?
In RNAcentral, I see this
In the Rfam part, I could not find any lncRNAs.
Am I searching correctly?
Thanks for any intuition