carmonalab / ProjecTILs

Interpretation of cell states using reference single-cell maps
GNU General Public License v3.0

ProjecTILs::Hs2Mm.convert.table generation #39

Closed by Close-your-eyes 1 year ago

Close-your-eyes commented 2 years ago

Dear all,

This is a really exciting package! I am exploring its functionality and looking into the details of some functions, and I wonder: how can a conversion table for mouse <-> human gene orthologs be generated elegantly? Did you use BioMart?

I just noticed that ZNF683 (human) is not included in the default table (ProjecTILs::Hs2Mm.convert.table), although a mouse ortholog is described: Znf683. I have not yet run a systematic check on a larger number of genes. Maybe the table is just outdated, so I wonder how to generate my own. I found this one on GitHub: https://github.com/vitkl/orthologsBioMART.

But anyhow: How did you generate your table? Can you tell us?

Thanks, yours, Chris.

mass-a commented 2 years ago

Hello!

Yes, we used the web interface for BioMart (http://www.ensembl.org/info/data/biomart/index.html) to download all genes with a known ortholog between mouse and human. We did the database dump a couple of years ago, so it's possible that some genes are missing.

I know there is also a biomaRt package in Bioconductor that allows interacting with the database; this would probably be a more elegant way of generating a table of orthologs.
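As a rough illustration of turning such a web-interface download into a conversion table, here is a minimal sketch in Python. It assumes a tab-separated export with the column headers `Gene name` and `Mouse gene name`; these headers, the function name `read_ortholog_pairs`, and the sample rows are assumptions for illustration and need to be adapted to the actual file.

```python
import csv
import io

# Illustrative stand-in for a BioMart TSV export. The column headers are an
# assumption and should be matched to the headers of the real download.
SAMPLE_EXPORT = (
    "Gene name\tMouse gene name\n"
    "CD8A\tCd8a\n"
    "PDCD1\tPdcd1\n"
    "GNLY\t\n"      # row without a mouse ortholog
    "CD8A\tCd8a\n"  # duplicate row, as can occur in exports
)


def read_ortholog_pairs(handle, human_col="Gene name", mouse_col="Mouse gene name"):
    """Return a sorted list of unique (human, mouse) gene symbol pairs."""
    reader = csv.DictReader(handle, delimiter="\t")
    pairs = set()
    for row in reader:
        human = (row.get(human_col) or "").strip()
        mouse = (row.get(mouse_col) or "").strip()
        # Keep only rows where both symbols are present.
        if human and mouse:
            pairs.add((human, mouse))
    return sorted(pairs)


pairs = read_ortholog_pairs(io.StringIO(SAMPLE_EXPORT))
```

For a real download, the `io.StringIO` object would be replaced by an opened file handle.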

Best -massimo

Close-your-eyes commented 2 years ago

Thank you.

I wrote a script that should produce such a table directly within R:

https://gist.github.com/Close-your-eyes/803c080e400e5626a0f4f68fa87b517b

However, ZNF683 was not mapped using this script.

Close-your-eyes commented 2 years ago

It may be beneficial to combine the table from your package with one generated on one's own. I noticed that each table has unique entries. For example, Ccl5 is not mapped as an ortholog of CCL5 by BioMart (as used in the script). This is consistent with what can be found on the Ensembl website. However, a quick web search suggests that Ccl5 in mouse corresponds to CCL5 in human, and your table does include Ccl5.
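The idea of combining both tables and checking which entries are unique to each can be sketched as follows. This is a minimal sketch in Python, not the package's method: the function name `combine_tables` is made up for illustration, and the `GENEX`/`Genex` pair is a hypothetical placeholder for an entry found only in the self-generated table.

```python
def combine_tables(package_pairs, own_pairs):
    """Merge two lists of (human, mouse) pairs and report unique entries."""
    package_set = set(package_pairs)
    own_set = set(own_pairs)
    return {
        "combined": sorted(package_set | own_set),
        "only_in_package": sorted(package_set - own_set),
        "only_in_own": sorted(own_set - package_set),
    }


# CCL5/Ccl5 is present only in the package table, as described above.
package_pairs = [("CCL5", "Ccl5"), ("CD8A", "Cd8a")]
# GENEX/Genex is a hypothetical placeholder, not a real gene pair.
own_pairs = [("CD8A", "Cd8a"), ("GENEX", "Genex")]

result = combine_tables(package_pairs, own_pairs)
```

Inspecting `only_in_package` and `only_in_own` before merging makes it easier to review questionable mappings by hand instead of silently accepting the union.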