TCP-Lab / SeqLoader

Constructors and methods for `xSeries` and `xModel` S3 classes

Provide a rule to collapse expression of genes with the same key #6

Open Feat-FeAR opened 3 months ago

Feat-FeAR commented 3 months ago

In `subsetGenes.xSeries`, when a given key value (e.g., a gene symbol) occurs more than once in the annotation table of the xSeries (i.e., the same SYMBOL is shared by multiple ENSG IDs), no action is taken: each entry gets its own bar in the final chart, even though its name is not unique. It would be better to provide a rule to collapse the statistics of genes sharing the same key value (e.g., keeping only the most expressed one, or taking their sum), provided that the user's "subsetting intention" is based on the geneset key rather than on (unique) ENSG IDs. See the same problem in GOZER:

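# Sort DEGs by adjusted p-value, so the most significant entry comes first
DEGs <- DEGs[order(DEGs$adj_pval), ]
# For each gene symbol, keep only the first (most significant) row
DEGs <- DEGs[!duplicated(DEGs$GeneSymbol), ]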
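
A collapse rule along the lines proposed above could be sketched as follows. This is only a minimal illustration, not the SeqLoader implementation: the helper name `collapse_by_key`, the column name `mean_expr`, and the toy expression values are all hypothetical, and the xSeries object is approximated by a plain data frame.

```r
# Toy statistics table (expression values are made up for illustration;
# a real xSeries object is assumed to carry comparable columns).
stats <- data.frame(
  ENSG      = c("ENSG00000197584", "ENSG00000275163", "ENSG00000184945"),
  SYMBOL    = c("KCNMB2", "KCNMB2", "AQP12A"),
  mean_expr = c(5.0, 1.5, 2.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse rows sharing the same key value.
# rule = "max" keeps the most expressed entry, rule = "sum" adds them up.
collapse_by_key <- function(df, key = "SYMBOL", value = "mean_expr",
                            rule = c("max", "sum")) {
  rule <- match.arg(rule)
  if (rule == "max") {
    # Sort by decreasing expression, then keep the first row per key
    df <- df[order(df[[value]], decreasing = TRUE), ]
    df <- df[!duplicated(df[[key]]), ]
  } else {
    # Sum the values within each key; the ENSG column is dropped because
    # a collapsed row no longer maps to a single gene ID
    df <- aggregate(df[value], by = df[key], FUN = sum)
  }
  rownames(df) <- NULL
  df
}

collapsed_max <- collapse_by_key(stats, rule = "max")
collapsed_sum <- collapse_by_key(stats, rule = "sum")
```

With `rule = "max"` the two KCNMB2 rows reduce to the more expressed one, so the ENSG ID survives; with `rule = "sum"` the ID is necessarily lost, which may matter if downstream code still expects one.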

Feat-FeAR commented 2 months ago

Importantly, this does happen in practice, even within the restricted geneset of ion channels, as of org.Hs.eg.db v.3.18.0 (Bioconductor version 3.18 - BiocManager 1.30.23):

KCNMB2

ENSG00000197584 KCNMB2  potassium calcium-activated channel subfamily M regulatory beta subunit 2
ENSG00000275163 KCNMB2  potassium calcium-activated channel subfamily M regulatory beta subunit 2

AQP12

ENSG00000184945 AQP12A  aquaporin 12A   protein-coding
ENSG00000185176 AQP12A,AQP12B   aquaporin 12A,aquaporin 12B     protein-coding
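
Cases like the two above can be spotted programmatically. The sketch below rebuilds the four rows listed in this comment as a plain data frame (the `annot` object and its column names are assumptions, not the actual xSeries annotation structure) and reports the symbols mapped to more than one ENSG ID, both with exact matching and after splitting comma-separated symbol fields.

```r
# Annotation rows copied from the examples above (ID and symbol only).
annot <- data.frame(
  ENSG   = c("ENSG00000197584", "ENSG00000275163",
             "ENSG00000184945", "ENSG00000185176"),
  SYMBOL = c("KCNMB2", "KCNMB2", "AQP12A", "AQP12A,AQP12B"),
  stringsAsFactors = FALSE
)

# Exact matching: count rows per symbol string
n_ids <- table(annot$SYMBOL)
dup_symbols <- names(n_ids)[as.vector(n_ids) > 1]

# Expand comma-separated fields to one row per (ENSG, symbol) pair,
# so that "AQP12A,AQP12B" also counts as an occurrence of AQP12A
sym_list <- strsplit(annot$SYMBOL, ",", fixed = TRUE)
long <- data.frame(
  ENSG   = rep(annot$ENSG, lengths(sym_list)),
  SYMBOL = unlist(sym_list),
  stringsAsFactors = FALSE
)
n_ids_long <- table(long$SYMBOL)
dup_long <- names(n_ids_long)[as.vector(n_ids_long) > 1]
```

Here exact matching flags only KCNMB2, while the expanded table flags AQP12A as well, so the choice of matching strategy affects which entries a collapse rule would act on.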