Closed DiegoZavallo closed 1 year ago
I'm sorry it took me so long to respond; I am just catching up on open issues now.
Did you ever figure this out?
One thing I can say is that I would not use this strategy. Specifically, I strongly discourage any pre-filtering of the raw reads. If you only align reads of a certain length, then degraded RNA bits (like tRNA and rRNA chunks) will not be obvious; all you will have left are the random bits that happened to be 20-24 nts long.
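To illustrate the point, one can tally read lengths in the raw, unfiltered FASTQ before any alignment. The following is only a minimal sketch in Python and is not part of ShortStack: the function name `read_length_counts` and the example records are made up for illustration, and it assumes a plain (uncompressed) FASTQ with exactly four lines per record.

```python
import io
from collections import Counter
from itertools import islice


def read_length_counts(handle):
    """Count reads per sequence length in a four-line-per-record FASTQ stream."""
    counts = Counter()
    # The sequence is the 2nd line of every 4-line record (index 1, step 4).
    for seq in islice(handle, 1, None, 4):
        counts[len(seq.strip())] += 1
    return counts


# Made-up example data: two 21-nt reads and one 33-nt read.
records = [("r1", "A" * 21), ("r2", "C" * 21), ("r3", "G" * 33)]
example = io.StringIO(
    "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)
)

counts = read_length_counts(example)
for length in sorted(counts):
    print(length, counts[length])
```

On a real file the same function could be called on an open file handle. If a large share of reads falls outside 20-24 nts, that is the kind of signal that length pre-filtering would hide.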
Yes, something clearly broke in the alignment step.
Hi Felix, I have a soybean sRNA-Seq dataset that I previously used with ShortStack to detect miRNAs. Now I want to run it without --locifile so that it creates the sRNA loci gff3 file, which I will later use for differential expression analysis. First I "cat" all the fastq files into one big file, keeping only reads of 21-22 and 24 nt in length.
Then I ran ShortStack:
ShortStack --readfile 5-All_21-22-24nt.fq --genomefile Glycine_max.Glycine_max_v2.1.dna.chromosome.fa --bowtie_cores 14 --mismatches 1 --bowtie_m all --outdir sRNA_loci/
I set --bowtie_m all so that reads with more than 50 multimapping sites are also mapped, in order to create a comprehensive annotation file for sRNA loci, as I asked previously: https://github.com/MikeAxtell/ShortStack/issues/76
It ran all the way through, but it seems that it only recognized 1 read that mapped to the genome!?
This is what I had:
I googled these errors and they seem to have to do with the headers in the files.
Oddly, when I looked at the ErrorLog, it appears to have mapped correctly, with more than 95% of the reads mapped.
I attached the ErrorLog.txt and the Log.txt so you can see them. Log.txt ErrorLogs.txt
The headers of my fastq file seem OK, and I downloaded the Glycine max genome again, but the results were the same.
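For a more systematic check of the fastq file than looking at it by eye, something like the sketch below could be used. It is only an illustration in Python, not a ShortStack feature: the function name `check_fastq_records` and the example records are hypothetical, it assumes an uncompressed fastq with four lines per record, and it says nothing about the genome FASTA or the headers of the alignment file.

```python
import io
from itertools import zip_longest


def check_fastq_records(handle):
    """Return a list of (record_number, problem) for malformed FASTQ records."""
    problems = []
    lines = (line.rstrip("\n") for line in handle)
    # Group the stream into chunks of four lines; a short final chunk is padded with None.
    for number, record in enumerate(zip_longest(*[lines] * 4), start=1):
        if None in record:
            problems.append((number, "incomplete record"))
            break
        header, seq, plus, qual = record
        if not header.startswith("@"):
            problems.append((number, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((number, "separator line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((number, "sequence and quality lengths differ"))
    return problems


# Made-up example: the second record has a quality string that is too short.
good = "@r1\nACGT\n+\nIIII\n"
bad = "@r2\nACGT\n+\nIII\n"
problems = check_fastq_records(io.StringIO(good + bad))
print(problems)
```

An empty list means every record passed these three basic checks; it does not prove the file is otherwise valid.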
Any thoughts?
Best
Diego