stuart-lab / signac

R toolkit for the analysis of single-cell chromatin data
https://stuartlab.org/signac/

Option to define column from genome annotation used in GeneActivity as gene name #837

Closed skilpinen closed 2 years ago

skilpinen commented 2 years ago

Currently, if I am not mistaken, GeneActivity() always uses the gene_name column from the genome annotation of the chromatin assay as the basis for gene nomenclature. This causes problems with label transfer from scRNA-seq data if that data has been processed with Ensembl IDs. Could there be an option in GeneActivity() that specifies which column is used for gene identification? For example, in this case:

GRanges object with 1763965 ranges and 5 metadata columns:
                     seqnames          ranges strand |              tx_id gene_name            gene_id   gene_biotype type
  ENSMUSE00001236884     chr3 3508030-3508332      + | ENSMUST00000108393     Hnf4g ENSMUSG00000017688 protein_coding exon
  ENSMUSE00000676606     chr3 3634150-3634347      + | ENSMUST00000108394     Hnf4g ENSMUSG00000017688 protein_coding exon
  ENSMUSE00001345708     chr3 3638059-3638230      + | ENSMUST00000108393     Hnf4g ENSMUSG00000017688 protein_coding exon

The ENSMUSG ID is available in the same annotation data, in the gene_id column.

timoast commented 2 years ago

This is a good idea and should be straightforward to implement. I'd welcome a PR if anyone would like to contribute this.

timoast commented 2 years ago

I've now added a gene.id parameter to GeneActivity() to enable naming the genes using gene IDs rather than gene names.
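
For readers landing on this thread, here is a minimal sketch of how the new parameter might be used. It assumes a Signac version that includes the gene.id argument, and it uses the small atac_small object and example fragment file bundled with Signac as a stand-in for a real dataset. Whether the bundled annotation carries the gene_id column that this option relies on is an assumption, not something stated in this thread.

```r
library(Signac)

# Attach the example fragment file shipped with Signac to the bundled
# `atac_small` object (a stand-in for a real dataset).
fpath <- system.file("extdata", "fragments.tsv.gz", package = "Signac")
fragments <- CreateFragmentObject(
  path = fpath,
  cells = colnames(atac_small),
  validate.fragments = FALSE
)
Fragments(atac_small) <- fragments

# gene.id = TRUE asks GeneActivity() to name rows by gene ID rather than
# by gene name (assumes the annotation has a gene_id column).
gene.activities <- GeneActivity(atac_small, gene.id = TRUE)

# Inspect the row names to check which identifiers were used.
head(rownames(gene.activities))
```

With Ensembl-style row names, the gene activity matrix should line up with an scRNA-seq reference that was also processed with Ensembl IDs, which is the label-transfer case raised above.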

skilpinen commented 2 years ago

Thanks. I should have taken on the PR and done that as well, but there are just too many open things at the moment. Anyway, thanks a lot, this is a really phenomenal library.