Closed sr320 closed 6 months ago
NOTE: At the moment, I'm not sure what the distinction is between the three FastAs. The filenames provide a clue, but the documentation doesn't seem to explain them specifically.
NOTE: FastA contains only the miRNAs identified as matches. It does NOT contain predicted miRNAs!
EDITED: Updated to note that the ShortStack FastAs do not contain predicted miRNAs.
Just going to add this here for later discussion
File Path | Format | Type | Num Seqs | Sum Len | Min Len | Avg Len | Max Len |
---|---|---|---|---|---|---|---|
../output/11.1-Apul-sRNAseq-miRdeep2-31bp-fastp-merged//mirna_results_03_04_2024_t_13_00_39/novel_mature_03_04_2024_t_13_00_39_score-50_to_na.fa | FASTA | DNA | 896 | 19,310 | 17 | 21.6 | 25 |
../output/11.1-Apul-sRNAseq-miRdeep2-31bp-fastp-merged//mirna_results_03_04_2024_t_13_00_39/novel_pres_03_04_2024_t_13_00_39_score-50_to_na.fa | FASTA | DNA | 896 | 54,056 | 35 | 60.3 | 110 |
../output/11.1-Apul-sRNAseq-miRdeep2-31bp-fastp-merged//mirna_results_03_04_2024_t_13_00_39/novel_star_03_04_2024_t_13_00_39_score-50_to_na.fa | FASTA | DNA | 896 | 19,339 | 13 | 21.6 | 31 |
../output/13.2.1-Apul-sRNAseq-ShortStack-31bp-fastp-merged-cnidarian_miRBase/ShortStack_out/mir.fasta | FASTA | DNA | 114 | 5,281 | 21 | 46.3 | 98 |
I've updated my previous post to indicate that the ShortStack FastAs do not contain predicted miRNAs.
For predicted miRNAs, we'll have to use the Results.gff3 (e.g. https://github.com/urol-e5/deep-dive/blob/main/E-Peve/output/08.2-Peve-sRNAseq-ShortStack-31bp-fastp-merged/ShortStack_out/Results.gff3) to extract FastA.
But it would probably also be a good idea to extract only those predicted miRNAs that meet some minimum threshold of read alignments (found in column 6).
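A minimal sketch of that filtering step, assuming Results.gff3 is a standard tab-delimited, nine-column GFF3 with the read-alignment count in column 6. The feature type name passed in (`mature_miRNA` in the usage note) and the `min_reads` value are assumptions for illustration, not something settled in this thread; check column 3 of the actual file before relying on them.

```python
from typing import Iterable, Iterator, List, NamedTuple, Set


class GffRecord(NamedTuple):
    seqid: str
    feature_type: str   # GFF3 column 3
    start: int          # GFF3 column 4 (1-based, inclusive)
    end: int            # GFF3 column 5 (1-based, inclusive)
    reads: float        # GFF3 column 6 (read alignments, per this thread)
    strand: str         # GFF3 column 7
    attributes: str     # GFF3 column 9


def parse_gff3(lines: Iterable[str]) -> Iterator[GffRecord]:
    """Yield one record per feature line, skipping comments and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines and lines with no value in column 6.
        if len(cols) != 9 or cols[5] == ".":
            continue
        yield GffRecord(cols[0], cols[2], int(cols[3]), int(cols[4]),
                        float(cols[5]), cols[6], cols[8])


def filter_by_reads(records: Iterable[GffRecord],
                    min_reads: float,
                    feature_types: Set[str]) -> List[GffRecord]:
    """Keep records of the requested types with at least min_reads in column 6."""
    return [r for r in records
            if r.feature_type in feature_types and r.reads >= min_reads]
```

Usage would look something like `filter_by_reads(parse_gff3(open("Results.gff3")), min_reads=10, feature_types={"mature_miRNA"})`, where `10` is a placeholder and not a recommended cutoff. The coordinates of the retained records can then be used to pull sequences from the genome FastA.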
What value is in column six? That is, how do we filter? The numbers vary by orders of magnitude.
> What value is column six?

Read alignments (can be found in column 6).
Sorry, what do you mean by "how do we filter?"
We'd have to come up with a number of reads which we think provides sufficient support to decide whether or not a predicted miRNA locus is accurate or not.
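Since the counts span orders of magnitude, one way to inform that decision is to tabulate how many loci would survive at a few candidate cutoffs. This is only a sketch: the function name and the candidate thresholds are hypothetical, and the read counts would come from column 6 of Results.gff3.

```python
from typing import Dict, Iterable, Sequence


def loci_retained(read_counts: Sequence[float],
                  thresholds: Iterable[float]) -> Dict[float, int]:
    """For each candidate threshold, count the loci with at least that many reads."""
    return {t: sum(1 for c in read_counts if c >= t) for t in thresholds}
```

For example, calling `loci_retained(counts, [1, 10, 100, 1000])` on the column 6 values for one species shows how quickly predicted loci drop out as the cutoff rises by powers of ten, which may make it easier to agree on a number.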
Sorry, I thought that was a confidence score; I did not realize it was the number of reads.
Which is odd as you clearly stated "read alignments (can be found in column 6)" :)
for each species, denoting which are database matches and which are de novo. Preferably in the repo if size allows.