nf-core / scrnaseq

A single-cell RNAseq pipeline for 10X genomics data
https://nf-co.re/scrnaseq
MIT License

Standardize h5ad/seurat var gene annotations across aligners #363

Open alexblaessle opened 1 month ago

alexblaessle commented 1 month ago

Description of feature

Dear all,

I just found out that in commit https://github.com/nf-core/scrnaseq/commit/aae5fc5f618d1a4ccd40428d14050d838876f7fe we moved away from using ensembl_gene_ids as the standard feature identifier for kallisto.

Even though users generally want gene names, these are not necessarily persistent identifiers. I would suggest that each aligner output an h5ad/seurat object with ensembl_gene_id as the var index and an additional column, "gene_symbol" or "gene_name".
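
To make the proposed layout concrete, here is a minimal sketch of what the standardized `var` table could look like. It uses only pandas, since `var` in an h5ad (AnnData) object is a pandas DataFrame. The helper name `standardize_var` and the input column name `gene_ids` are assumptions for illustration, not something the pipeline currently provides, and the sketch assumes the incoming table is indexed by gene symbol.

```python
import pandas as pd


def standardize_var(
    var: pd.DataFrame,
    id_col: str = "gene_ids",          # assumed name of the column holding Ensembl IDs
    symbol_col: str = "gene_symbol",   # column name suggested in this issue
) -> pd.DataFrame:
    """Return a copy of `var` indexed by Ensembl gene ID, with symbols as a column."""
    if var[id_col].duplicated().any():
        raise ValueError("Ensembl gene IDs must be unique to serve as the index")
    out = var.copy()
    # Keep the current index (assumed to hold gene symbols) as a regular column.
    out[symbol_col] = out.index.to_numpy()
    # Promote the Ensembl ID column to the index and give it a stable name.
    out = out.set_index(id_col)
    out.index.name = "ensembl_gene_id"
    return out


# Toy input: a var table indexed by gene symbol, with Ensembl IDs in a column.
example_var = pd.DataFrame(
    {"gene_ids": ["ENSG00000000003", "ENSG00000000005", "ENSG00000000419"]},
    index=["TSPAN6", "TNMD", "DPM1"],
)

standardized = standardize_var(example_var)
print(standardized)
```

With an AnnData object, the result could be assigned back with something like `adata.var = standardize_var(adata.var)`. A Seurat output would need an equivalent step on the R side.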

Tagging @grst @fmalmeida @apeltzer

grst commented 1 month ago

I'm all for it, and this ties in with #310. I don't have the capacity to tackle this myself right now, but if you task @fmalmeida or other ZS folks with it, I'm happy to provide input.