FertigLab / dominoSignal

A software package for connecting cell level features in single cell RNA sequencing data with receptor ligand activity.
https://fertiglab.github.io/dominoSignal/
GNU General Public License v3.0

mouse dataset gene conversion #134

Closed: mindykimgraham closed this issue 1 week ago

mindykimgraham commented 1 week ago

In the previous version of `create_domino` (Domino v1.0), there was an argument, `gene_conv = c("HGNC", "MGI")`, that allowed a user to convert human genes to mouse genes. I tried this argument in the updated version, hoping it had been carried over to dominoSignal, but it seems `gene_conv` is no longer included. Is it perhaps under a different name? I checked the documentation but didn't see anything.

jmitchell81 commented 1 week ago

Thank you for your question @mindykimgraham. You are correct: we deprecated the `gene_conv` argument to `create_domino` when we updated the package to dominoSignal. In the older design, `create_domino` used functions from biomaRt to access lists of gene orthologs. However, we found that the site hosting these conversion tables was not being maintained, and we decided that keeping an up-to-date method for finding orthologs and resolving multiple ortholog mappings inside this package was not sustainable.

There is an alternative approach to ortholog conversion that builds on other changes we've made to the package. We switched from supporting only CellPhoneDB as a ligand-receptor pair database to a more universal format we call the `rl_map`, a data.frame in which each row describes one possible ligand-receptor interaction. We still provide a helper function for using the CellPhoneDB database, described in our getting started vignette. Because the genes encoding ligands and receptors have their own columns in the `rl_map`, you can use any method you prefer to convert these genes to orthologs in other organisms. You can also use a completely different ligand-receptor database curated for the organism you are working on, so long as you format it to match the `rl_map` format.
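As a rough illustration of converting the gene columns of an `rl_map`, here is a minimal base R sketch. It is not part of dominoSignal: the toy table carries only a few columns, the column names `gene_A` and `gene_B` and the comma-separated encoding of complex members are assumptions about the `rl_map` layout that should be checked against your own table, and `orthologs`, `convert_genes`, and `FAKEGENE1` are hypothetical names made up for the example.

```r
# Toy rl_map with human gene symbols. Only a subset of columns is shown;
# the column names are assumed, so check them against your own rl_map.
rl_map <- data.frame(
  int_pair = c("TGFB1 & TGFBR1", "a11b1 complex & COL1A1", "FAKEGENE1 & TGFBR1"),
  gene_A = c("TGFB1", "ITGB1,ITGA11", "FAKEGENE1"),
  type_A = c("L", "R", "L"),
  gene_B = c("TGFBR1", "COL1A1", "TGFBR1"),
  type_B = c("R", "L", "R"),
  stringsAsFactors = FALSE
)

# Hypothetical human-to-mouse lookup (names = human, values = mouse).
# In practice, build this with whichever ortholog resource you prefer.
orthologs <- c(
  TGFB1 = "Tgfb1", TGFBR1 = "Tgfbr1",
  ITGB1 = "Itgb1", ITGA11 = "Itga11", COL1A1 = "Col1a1"
)

# Convert each entry, splitting comma-separated complex members first.
# An entry becomes NA if any of its genes has no ortholog in the lookup.
convert_genes <- function(x, lookup) {
  vapply(strsplit(x, ",", fixed = TRUE), function(g) {
    mapped <- unname(lookup[g])
    if (anyNA(mapped)) NA_character_ else paste(mapped, collapse = ",")
  }, character(1))
}

rl_map_mouse <- rl_map
rl_map_mouse$gene_A <- convert_genes(rl_map$gene_A, orthologs)
rl_map_mouse$gene_B <- convert_genes(rl_map$gene_B, orthologs)

# Drop interactions where either partner could not be fully converted
keep <- !is.na(rl_map_mouse$gene_A) & !is.na(rl_map_mouse$gene_B)
rl_map_mouse <- rl_map_mouse[keep, ]
```

Dropping an interaction when any member lacks an ortholog is one possible policy; you may prefer to keep such rows and review them by hand.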

My apologies for not making this clearer in the documentation. We'll be sure to rectify that. Please let me know if you have any other questions about using the package.

mindykimgraham commented 1 week ago

Thank you for your thorough response. I have generated a workaround and am sharing it here in case folks are also interested in using CellPhoneDB for their mouse data.

```r
# Convert genes from CellPhoneDB to mouse
library(biomaRt)
library(dominoSignal) # provides create_rl_map_cellphonedb()

genes <- read.csv("./cellphoneDB/gene_input.csv", stringsAsFactors = FALSE)

# Set up the human and mouse datasets using the specified Ensembl archive host
human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl",
                 host = "https://dec2021.archive.ensembl.org/")
mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl",
                 host = "https://dec2021.archive.ensembl.org/")

# Map human gene names to mouse gene names with getLDS
conversion <- getLDS(
  attributes = c("hgnc_symbol"), filters = "hgnc_symbol",
  values = genes$hgnc_symbol, mart = human,
  attributesL = c("mgi_symbol"), martL = mouse,
  uniqueRows = TRUE
)

# Merge the conversion data with your original dataframe
genes_converted <- merge(genes, conversion,
                         by.x = "hgnc_symbol", by.y = "HGNC.symbol",
                         all.x = TRUE)

# Inspect the result
head(genes_converted)

# Drop column 2, rename the mouse symbol column to gene_name, and restore
# the original column order (positions depend on the gene_input.csv layout)
genes_converted <- genes_converted[-2]
colnames(genes_converted)[4] <- "gene_name"
genes_converted <- genes_converted[c(4, 2, 1, 3)]

# Confirm the dataframe mirrors the original
head(genes)
head(genes_converted)

# Create the map
# Note: proteins, interactions, and complexes are not defined in this snippet;
# they are assumed to have been read from the other CellPhoneDB input files.
rl_map <- create_rl_map_cellphonedb(
  genes = genes_converted,
  proteins = proteins,
  interactions = interactions,
  complexes = complexes,
  database_name = "CellPhoneDB_v5.0" # database version used
)

knitr::kable(head(rl_map))
```
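One caveat with the merge step above: if a human symbol maps to more than one mouse symbol, `merge()` produces one row per match, so the converted gene table can end up with duplicated entries. The sketch below shows one way to spot and set aside such ambiguous mappings before merging. The gene symbols are made-up placeholders, the column names mirror the `HGNC.symbol` and `MGI.symbol` columns of the conversion table above, and removing ambiguous genes outright is only one possible policy.

```r
# Toy conversion table shaped like the getLDS output above
# (all gene symbols here are placeholders, not real orthologs)
conversion <- data.frame(
  HGNC.symbol = c("GENEA", "GENEB", "GENEB", "GENEC"),
  MGI.symbol = c("Genea", "Geneb1", "Geneb2", "Genec"),
  stringsAsFactors = FALSE
)

# Count how many mouse symbols each human symbol maps to
n_hits <- table(conversion$HGNC.symbol)
ambiguous <- names(n_hits)[n_hits > 1]

# Keep only the unambiguous mappings for the merge;
# review the ambiguous ones by hand
one_to_one <- conversion[!conversion$HGNC.symbol %in% ambiguous, ]
to_review <- conversion[conversion$HGNC.symbol %in% ambiguous, ]
```

Running the same check in the other direction (mouse symbols shared by several human symbols) follows the same pattern with the two columns swapped.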