gao-lab / GLUE

Graph-linked unified embedding for single-cell multi-omics data integration
MIT License

get_gene_annotation returns NaN #118

Open fafa92 opened 5 months ago

fafa92 commented 5 months ago

Hi,

Thanks for the great package. I'm trying to follow step 1 and when I get to

scglue.data.get_gene_annotation(
    rna, gtf="gencode.vM25.chr_patch_hapl_scaff.annotation.gtf.gz",
    gtf_by="gene_name"
)

it returns NaN for every value of every gene. I checked my RNA-seq file, and many of its genes do seem to be present in the GTF file, so I'm not sure what is causing this. When I run the same step with the chenRNA data, everything looks fine.

P.S.: Every step before that looks fine.

I appreciate any help in advance. Thanks!

Jeff1995 commented 4 months ago

Hi @fafa92! Thanks for your interest in GLUE! Could you please post the content of rna.var before running get_gene_annotation? By default, it joins the GTF "gene_name" on the index of rna.var, i.e., rna.var_names. If the gene names that match the GTF "gene_name" reside in another column, you would need to specify that column as var_by in get_gene_annotation.
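To illustrate why a join on the wrong key yields all-NaN annotations, here is a minimal pandas sketch. It does not call scglue. The two toy tables, their column names, and their values are made up for illustration; only the idea of joining on the index versus on another column comes from the comment above.

```python
import pandas as pd

# Toy stand-in for rna.var (hypothetical layout): the index holds opaque IDs,
# while the gene symbols live in a separate "symbol" column.
var = pd.DataFrame(
    {"symbol": ["Gapdh", "Actb", "Sox2"]},
    index=["id1", "id2", "id3"],
)

# Toy stand-in for the gene records read from a GTF, keyed by gene_name.
# The coordinates are placeholders, not real annotation values.
gtf = pd.DataFrame(
    {"chrom": ["chr1", "chr1"], "chromStart": [100, 300]},
    index=pd.Index(["Gapdh", "Actb"], name="gene_name"),
)

# Joining on the index: none of "id1".."id3" is a gene_name, so every
# annotation column comes back as NaN.
by_index = var.join(gtf, how="left")

# Joining on the "symbol" column instead: genes present in the GTF table
# pick up their annotation, and only the missing one stays NaN.
by_column = var.join(gtf, on="symbol", how="left")

print(by_index)
print(by_column)
```

In scglue terms, the second join corresponds to pointing var_by at the column that actually holds the gene names.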

TuDou-PK commented 1 month ago

@fafa92 Hi, I ran into the same problem. Have you solved it? @Jeff1995 Hi, I checked rna.var, and it doesn't seem to have any problems. Is there any other possible solution? I've posted my rna.var below. Many thanks:

image

TuDou-PK commented 1 month ago

Ah, problem solved. "gencode.vM25.chr_patch_hapl_scaff.annotation.gtf.gz" is the mouse gene annotation, but my data is human, which is why every gene came back as NaN.
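For anyone hitting the same symptom, a quick sanity check is to measure how many of your gene names appear in the GTF at all before annotating. The sketch below uses only the standard library. The helper names gtf_gene_names and overlap_fraction are hypothetical (not part of scglue), and the tiny GTF it writes is made-up demo data. It relies on the fact that mouse symbols are usually written like "Gapdh" while human symbols are usually uppercase like "GAPDH", so a species mismatch tends to show up as near-zero overlap.

```python
import gzip
import re
import tempfile
from pathlib import Path

# Matches the gene_name attribute in the ninth GTF column.
GENE_NAME = re.compile(r'gene_name "([^"]+)"')


def gtf_gene_names(path):
    """Collect gene_name values from the 'gene' records of a gzipped GTF."""
    names = set()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9 or fields[2] != "gene":
                continue  # keep only gene-level records
            match = GENE_NAME.search(fields[8])
            if match:
                names.add(match.group(1))
    return names


def overlap_fraction(var_names, gtf_names):
    """Fraction of var_names that occur (case-sensitively) in gtf_names."""
    var_names = list(var_names)
    if not var_names:
        return 0.0
    return sum(name in gtf_names for name in var_names) / len(var_names)


# Demo on a tiny made-up GTF with mouse-style symbols (placeholder coordinates).
toy_gtf = (
    "##description: toy file\n"
    'chr1\ttoy\tgene\t100\t200\t.\t+\t.\tgene_id "g1"; gene_name "Gapdh";\n'
    'chr1\ttoy\tgene\t300\t400\t.\t-\t.\tgene_id "g2"; gene_name "Actb";\n'
)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy.gtf.gz"
    with gzip.open(path, "wt") as handle:
        handle.write(toy_gtf)
    names = gtf_gene_names(path)

mouse_style = overlap_fraction(["Gapdh", "Actb"], names)
human_style = overlap_fraction(["GAPDH", "ACTB"], names)
print(mouse_style, human_style)
```

With real data, something like overlap_fraction(rna.var_names, gtf_gene_names("your_annotation.gtf.gz")) should be well above zero. A value near zero suggests either the wrong species or release, or that the names being joined are not the ones the GTF uses.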