gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License

Issues with -M parameter #324

Open bvaldebenitom opened 3 years ago

bvaldebenitom commented 3 years ago

Hi!

I've been running some tests on a single transcript that is covered entirely by multi-mapped (multi-hit) reads, and ran into some issues with the -M parameter.

Command used: stringtie-2.1.5.OSX_x86_64/stringtie bundle_debug_20210313.bam -o exp2_test1.gtf -A exp2_test1_abund.txt -v -M 1

Output:

Running StringTie 2.1.5. Command line:
stringtie-2.1.5.OSX_x86_64/stringtie bundle_debug_20210313.bam -o exp2_test1.gtf -A exp2_test1_abund.txt -v -M 1
Default stack size for threads: 524288 (increased to 8388608)
[03/13 11:24:05]>bundle RANDOMExp2:1002-7469 [23784 alignments (6247 distinct), 0 junctions, 0 guides] begins processing...
[03/13 11:24:05]^bundle RANDOMExp2:1002-7469 done (0 processed potential transcripts).
[03/13 11:24:05] All threads finished.
Total count of aligned fragments: 3467.07
Fragment coverage length: 70

Contents of GTF file:

$ cat exp2_test1.gtf
# stringtie-2.1.5.OSX_x86_64/stringtie bundle_debug_20210313.bam -o exp2_test1.gtf -A exp2_test1_abund.txt -v -M 1
# StringTie version 2.1.5

Command used: stringtie-2.1.5.OSX_x86_64/stringtie bundle_debug_20210313.bam -o exp2_test1.gtf -A exp2_test1_abund.txt -v -M 1.12

Output:

Running StringTie 2.1.5. Command line:
stringtie-2.1.5.OSX_x86_64/stringtie bundle_debug_20210313.bam -o exp2_test1.gtf -A exp2_test1_abund.txt -v -M 1.12
Default stack size for threads: 524288 (increased to 8388608)
[03/13 11:25:51]>bundle RANDOMExp2:1002-7469 [23784 alignments (6247 distinct), 0 junctions, 0 guides] begins processing...
[03/13 11:25:51]^bundle RANDOMExp2:1002-7469 done (1 processed potential transcripts).
[03/13 11:25:51] All threads finished.
Total count of aligned fragments: 3467.07
Fragment coverage length: 70

Contents of GTF file:

$ cat exp2_test1.gtf
# stringtie-2.1.5.OSX_x86_64/stringtie bundle_debug_20210313.bam -o exp2_test1.gtf -A exp2_test1_abund.txt -v -M 1.12
# StringTie version 2.1.5
RANDOMExp2  StringTie   transcript  1002    7469    1000    .   .   gene_id "STRG.1"; transcript_id "STRG.1.1"; cov "37.522293"; FPKM "154607.062500"; TPM "999999.875000";
RANDOMExp2  StringTie   exon    1002    7469    1000    .   .   gene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1"; cov "37.522293";

Attached is the bundle_debug_20210313.bam file used.

With -M 1 the bundle yields no transcripts, while with -M 1.12 it yields one. I thought that -M ranged from 0 (only uniquely mapped reads in the bundle) to 1 (a bundle consisting only of multi-mapped reads). Is this a misunderstanding on my part, or are there bundles for which the value is expected to be greater than 1?
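To make my expectation concrete, here is a minimal sketch of how I assumed the multi-mapped fraction of a bundle is defined. The function name and the NH values are made up for illustration, and none of this is taken from StringTie's source; it only shows that a fraction defined this way can never exceed 1.

```python
def multimapped_fraction(nh_values):
    """Return the fraction of alignments whose NH value is greater than 1.

    nh_values: one integer per alignment, standing in for the SAM NH tag
    (number of reported alignments for the read). Hypothetical input, not
    read from the attached BAM.
    """
    if not nh_values:
        return 0.0
    multi = sum(1 for nh in nh_values if nh > 1)
    return multi / len(nh_values)


# Three toy bundles: uniquely mapped only, mixed, and multi-mapped only.
toy_bundles = {
    "unique_only": [1, 1, 1, 1],
    "mixed": [1, 1, 4, 4],
    "multi_only": [4, 4, 4, 4],
}

for name, nh_values in toy_bundles.items():
    print(name, multimapped_fraction(nh_values))
```

Under this definition a bundle made up only of multi-mapped reads gives exactly 1, so I would have expected -M 1 to let it through.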

Thanks!

bundle_debug_20210313.tar.gz