GoekeLab / bambu

Reference-guided transcript discovery and quantification for long read RNA-Seq data
GNU General Public License v3.0
190 stars 24 forks source link

inconsistent gene and transcript count #440

Closed alexyfyf closed 4 months ago

alexyfyf commented 4 months ago

Hi Ying,

I was using bambu 3.3.4 and found some of the transcripts and gene count did not match. I updated to the new version 3.6.0 from bioconductor, and the issue is still there.

se1 <- bambu(reads = bam.files, 
             annotations = bambuAnnotations, 
             stranded = T, 
             genome = fa.file, ncore = 32)
seGene <- transcriptToGeneExpression(se1)

If I then show the count for transcripts and gene for ENSG00000270405.1

assays(seGene)$counts[rowData(seGene)$GENEID == 'ENSG00000270405.1',]
barcode01_10m_sorted barcode02_10m_sorted barcode03_10m_sorted barcode04_10m_sorted 
                 274                  279                  256                    0 
barcode05_10m_sorted barcode06_10m_sorted 
                   0                    0
assays(se1)$CPM[rowData(se1)$GENEID == 'ENSG00000270405.1',]
barcode01_10m_sorted barcode02_10m_sorted barcode03_10m_sorted barcode04_10m_sorted 
                   0                    0                    0                    0 
barcode05_10m_sorted barcode06_10m_sorted 
                   0                    0 

The transcript count does not add up to the gene count. I have also checked the txt output using writeBambuOutput and confirmed the same issue in the counts_genes.txt and counts_transcript.txt. Is this something known? Or something I can check myself?

Thank you so much.

Cheers, Alex

andredsim commented 4 months ago

Hi,

Not Chen Ying here, but you may want to include the incompatible counts which are found in the metadata of se1, These are read counts that could not be associated to any transcript belonging to the gene, but would still be included in the gene count.

Let us know if that resolves the inconsistency.

Kind Regards, Andre Sim

alexyfyf commented 4 months ago

@andredsim , thank you so much, seems this is the case.