NUStatBioinfo / DegNorm

Normalizing RNA degradation in RNA-seq data
https://nustatbioinfo.github.io/DegNorm/

Allow user to identify gene names by `gene_id` or `gene_name` #40

Open ffineis opened 4 years ago

ffineis commented 4 years ago

`GenomeAnnotationLoader` currently prioritizes `gene_name` over `gene_id` when identifying genes (https://github.com/NUStatBioinfo/DegNorm/blob/master/degnorm/loaders.py#L151).

This is in response to the following user comment:


Our .gtf file (a GENCODE GTF) has multiple gene IDs that map to the same gene symbol. For example:

ENSG00000284934.1 DIABLO

ENSG00000184047.19 DIABLO

However, the DegNorm output file combines both IDs and reports a single set of adjusted counts for the DIABLO gene rather than one set per ID. Is it possible to obtain adjusted counts per gene ID?
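To make the requested behavior concrete, here is a minimal sketch of grouping exons by a user-selected attribute. It is not DegNorm's code: `parse_attributes`, `group_exons`, and the `gene_key` parameter are hypothetical names, and the exon coordinates in `SAMPLE` are made up for illustration. Only the two gene IDs and the DIABLO symbol come from the comment above.

```python
import re
from collections import defaultdict

# Matches quoted GTF attribute pairs such as: gene_id "ENSG00000184047.19";
_ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_attributes(attr_field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(_ATTR_RE.findall(attr_field))


def group_exons(gtf_lines, gene_key="gene_id"):
    """Group exon intervals by `gene_key` ("gene_id" or "gene_name").

    Hypothetical helper: illustrates the choice the issue asks for,
    not how GenomeAnnotationLoader is implemented.
    """
    if gene_key not in ("gene_id", "gene_name"):
        raise ValueError("gene_key must be 'gene_id' or 'gene_name'")
    groups = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep exon records only
        attrs = parse_attributes(fields[8])
        if gene_key in attrs:
            groups[attrs[gene_key]].append(
                (fields[0], int(fields[3]), int(fields[4]))
            )
    return dict(groups)


# Two exon records with made-up coordinates; both share the symbol DIABLO.
SAMPLE = [
    'chr12\tHAVANA\texon\t100\t200\t.\t-\t.\t'
    'gene_id "ENSG00000184047.19"; gene_name "DIABLO";',
    'chr12\tHAVANA\texon\t300\t400\t.\t-\t.\t'
    'gene_id "ENSG00000284934.1"; gene_name "DIABLO";',
]

by_id = group_exons(SAMPLE, gene_key="gene_id")
by_name = group_exons(SAMPLE, gene_key="gene_name")
print(sorted(by_id))    # two separate gene IDs
print(sorted(by_name))  # one merged symbol
```

Keyed on `gene_id`, the two DIABLO records stay separate; keyed on `gene_name`, they collapse into one entry, which mirrors the merged output described in the comment.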