MaayanLab / archs4

ARCHS4 RNA-seq processing scripts and web server pages.

ENSG genes when using gene_symbols #34

Closed (malonzm1 closed this issue 4 months ago)

malonzm1 commented 1 year ago

Hi,

When I use the R script (h5read(destination_file, "meta/genes/gene_symbol")), the matrix generated includes genes with the prefix ENSG alongside regular gene symbols. Why is this?

Thanks and good day.

lachmann12 commented 1 year ago

This is an issue with the Ensembl annotation. Some genes do not have an official gene symbol. We use the Ensembl ID as a placeholder in this case.
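
For readers who want to separate the placeholder entries from the official symbols after loading the gene list, here is a minimal sketch. It assumes the placeholders follow the usual human Ensembl gene ID pattern (ENSG plus 11 digits, optionally with a version suffix). The function name split_gene_labels and the two ENSG IDs in the example are made up for illustration and are not part of ARCHS4.

```python
import re

# Assumption: placeholders look like human Ensembl gene IDs,
# i.e. "ENSG" + 11 digits, optionally followed by ".<version>".
ENSG_PATTERN = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def split_gene_labels(labels):
    """Split labels into (official symbols, ENSG placeholders), keeping input order."""
    symbols, placeholders = [], []
    for label in labels:
        # HDF5 readers often return byte strings; decode them before matching.
        if isinstance(label, bytes):
            label = label.decode("utf-8")
        if ENSG_PATTERN.match(label):
            placeholders.append(label)
        else:
            symbols.append(label)
    return symbols, placeholders


# Illustrative input only; the ENSG IDs below are invented for the example.
example = ["TP53", "ENSG00000000001", b"ACTB", "ENSG00000000002.3"]
symbols, placeholders = split_gene_labels(example)
print(symbols)
print(placeholders)
```

Filtering this way only hides the placeholder rows. It does not change how the underlying counts were aggregated.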

malonzm1 commented 1 year ago

When I run kallisto with the script that generates output identical to elysium/archs4, there are no genes with the ENSG prefix (the output with ENSG genes includes > 60,000 genes, while the output without ENSG genes includes > 30,000 genes). Is it possible to make the kallisto output match (> 60,000 genes)?

lachmann12 commented 4 months ago

You can now use archs4py to mimic the output.