MikeAxtell / ShortStack

ShortStack: Comprehensive annotation and quantification of small RNA genes
MIT License

ShortStack on mRNA data? #149

Closed meerveld96 closed 3 months ago

meerveld96 commented 4 months ago

Hi Mike,

I got the following error, and ShortStack 4.0.3 stopped running:

    Traceback (most recent call last):
      File "/lustre/BIF/nobackup/ruite093/miniconda3/envs/ShortStack4.0.3/bin/ShortTracks", line 282, in <module>
        if last_pos < chr_sizes[last_chr]:
    KeyError: 'foobarbaz'

    Candidates examined: 0it [00:00, ?it/s]
    Traceback (most recent call last):
      File "/lustre/BIF/nobackup/ruite093/miniconda3/envs/ShortStack4.0.3/bin/ShortStack", line 3619, in <module>
        mir_qdata = mirna(args, merged_bam, fai, pmir_bedfile, read_count)
      File "/lustre/BIF/nobackup/ruite093/miniconda3/envs/ShortStack4.0.3/bin/ShortStack", line 2261, in mirna
        dn_q_bedlines, dn_locus_bedlines = zip(*denovo_mloci1)
    ValueError: not enough values to unpack (expected 2, got 0)
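Editor's note: the second traceback is the error Python raises when `zip(*...)` is applied to an empty list and the result is unpacked into two names. That suggests, though the thread does not confirm it, that the list of candidate loci was empty, which would be consistent with the "Candidates examined: 0it" line. The sketch below reproduces the error in isolation; `split_pairs` and `split_pairs_safe` are hypothetical names for illustration and are not part of ShortStack.

```python
def split_pairs(pairs):
    """Unzip a list of (a, b) tuples, mirroring `a, b = zip(*pairs)`."""
    first, second = zip(*pairs)  # raises ValueError when pairs is empty
    return first, second


def split_pairs_safe(pairs):
    """Same as split_pairs, but returns two empty tuples for empty input."""
    if not pairs:
        return (), ()
    first, second = zip(*pairs)
    return first, second


# Reproduce the failure with an empty list (hypothetical stand-in data).
try:
    split_pairs([])
except ValueError as err:
    message = str(err)
    print(message)
```

Guarding against the empty case, as in `split_pairs_safe`, is one common way to avoid this kind of crash, but it would not make mRNA-seq data a valid input.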

I'm wondering how this could happen. Is there a maximum number of input sequences for ShortStack? I feed forward and reversed messenger RNA fq.gz files with a total size of around 450 gigabytes. I also ran ShortStack on the BAM files, which gives the same problem. In another experiment I ran ShortStack on small RNAs, and that ran smoothly.

Thanks,

Best regards, Stefan

MikeAxtell commented 3 months ago

You stated "... I feed forward and reversed messenger RNA fq.gz files...". ShortStack is designed for small RNA-seq, not mRNA-seq, data. Input(s) to option --readfile should be small RNA-seq FASTQ or FASTA data, forward strand, where the 1st nt of each read is the 1st nt of the biological RNA. Similarly, inputs to ShortStack's --bamfile option should be BAM files of small RNA-seq alignments. Using mRNA-seq data will break ShortStack, as seems to be the case here.
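Editor's note: one quick way to tell whether a FASTQ file looks like small RNA-seq before running ShortStack is to inspect its read-length distribution. The following is a minimal sketch, not part of ShortStack: the function names are hypothetical, and the 20 to 24 nt window is an assumption based on typical plant small RNA sizes. It only makes sense for adapter-trimmed reads, since untrimmed reads all have the sequencer's read length.

```python
import gzip
from collections import Counter
from itertools import islice


def length_histogram(fastq_lines, max_reads=100_000):
    """Count read lengths from an iterable of FASTQ lines (4 lines per record)."""
    # The sequence is the 2nd line of every 4-line record.
    seq_lines = islice(fastq_lines, 1, None, 4)
    return Counter(len(line.strip()) for line in islice(seq_lines, max_reads))


def fraction_in_range(hist, low=20, high=24):
    """Fraction of reads whose length falls in [low, high] (assumed sRNA window)."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    in_range = sum(n for length, n in hist.items() if low <= length <= high)
    return in_range / total


def check_fastq(path, max_reads=100_000):
    """Sample the first max_reads reads of a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return fraction_in_range(length_histogram(handle, max_reads))
```

A low fraction from `check_fastq` on trimmed reads would hint that the library is not small RNA-seq; what counts as "low" is a judgment call that this sketch leaves to the user.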

meerveld96 commented 3 months ago

Thanks for the clear answer, I will use only small RNA-seq reads for ShortStack.