nf-core / nanoseq

Nanopore demultiplexing, QC and alignment pipeline
https://nf-co.re/nanoseq
MIT License

Stringtie and featurecounts results: counts_gene.txt has more rows than counts_transcript.txt #265

Open CHENAO-QIAN opened 7 months ago

CHENAO-QIAN commented 7 months ago

Description of the bug

I am running the pipeline with the StringTie2 and featureCounts options. In stringtie2/featureCounts/, I got count tables for genes and transcripts. However, counts_gene.txt has more rows than counts_transcript.txt: in counts_gene.txt, a single MSTRG gene (a StringTie-assigned gene ID) occupies several rows. I am wondering whether counts_gene.txt is actually reporting exon-level counts.
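
One way to test the exon-level hypothesis is to count how many rows each Geneid occupies in counts_gene.txt. The featureCounts documentation describes `-f` as counting at the feature level (here `-t exon`) rather than per meta-feature, which would give one row per exon. Below is a minimal shell sketch, assuming the usual tab-separated featureCounts layout (a leading `#` comment line, then a header row starting with `Geneid`). The function name `rows_per_gene` is hypothetical and not part of the pipeline.

```shell
# Print every Geneid that occurs on more than one row of a featureCounts
# table, followed by its row count (tab-separated, sorted by Geneid).
# Assumption: column 1 holds the ID named by -g.
rows_per_gene() {
    awk -F'\t' '
        /^#/ { next }                 # skip the command-line comment line
        $1 == "Geneid" { next }       # skip the header row
        { n[$1]++ }                   # tally rows per ID
        END { for (g in n) if (n[g] > 1) print g "\t" n[g] }
    ' "$1" | sort
}

# Example usage (path as reported above):
# rows_per_gene stringtie2/featureCounts/counts_gene.txt | head
```

If this prints many MSTRG IDs with counts greater than one, the table is most likely one row per exon rather than one row per gene.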

I checked the featureCounts code in the pipeline (below). How were these parameters chosen or optimized?

featureCounts \\
    -L \\
    -O \\
    -f \\
    -g gene_id \\
    -t exon \\
    -T $task.cpus \\
    -a $gtf \\
    -o counts_gene.txt \\
    $bams

featureCounts \\
    -L \\
    -O \\
    -f \\
    --primary \\
    --fraction \\
    -F GTF \\
    -g transcript_id \\
    -t transcript \\
    --extraAttributes gene_id \\
    -T $task.cpus \\
    -a $gtf \\
    -o counts_transcript.txt \\
    $bams
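
If the gene table does turn out to be per-exon, a rough gene-level view can be obtained by summing the rows that share a Geneid. The sketch below assumes the standard featureCounts layout, with the per-sample counts starting at column 7. The function name `collapse_to_gene` is hypothetical. Note that because `-O` assigns a read to every feature it overlaps, a read spanning two exons of the same gene is counted once per exon, so these sums may overestimate what a meta-feature-level run (the same command without `-f`) would report.

```shell
# Collapse a feature-level featureCounts table to one row per Geneid by
# summing each sample column (column 7 onward). Output: Geneid + sample sums.
collapse_to_gene() {
    awk -F'\t' -v OFS='\t' '
        /^#/ { next }                         # skip the comment line
        $1 == "Geneid" {                      # header: keep ID + sample names
            line = $1
            for (i = 7; i <= NF; i++) line = line OFS $i
            print line
            ncol = NF
            next
        }
        {
            # remember first-seen order so output follows the input order
            if (!($1 in seen)) { seen[$1] = 1; order[++k] = $1 }
            for (i = 7; i <= NF; i++) sum[$1, i] += $i
        }
        END {
            for (j = 1; j <= k; j++) {
                g = order[j]
                line = g
                for (i = 7; i <= ncol; i++) line = line OFS (sum[g, i] + 0)
                print line
            }
        }
    ' "$1"
}

# Example usage (hypothetical output file name):
# collapse_to_gene counts_gene.txt > counts_gene_collapsed.txt
```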

Thanks!

Command used and terminal output

nextflow run nf-core/nanoseq -r 3.1.0 -profile singularity --input samples.csv --protocol cDNA --skip_demultiplexing --aligner minimap2 --quantification_method stringtie2 --skip_fusion_analysis --skip_differential_analysis

No error.

Relevant files

No response

System information

Nextflow 23.04.1, HPC, Cmd, Singularity, CentOS, nanoseq 3.1.0