msraredon / NICHES

Niche Interactions and Cellular Heterogeneity in Extracellular Signaling
https://msraredon.github.io/NICHES/

omnipath_download() failed #3

Closed bhavyaac closed 2 years ago

bhavyaac commented 2 years ago

Hi, thank you for your hard work on this package.

I am currently trying to run the function RunNICHES(), and I am getting an error related to retrieving data from Omnipath.

niches_results <- RunNICHES(seurat,
                 assay = 'RNA',
                 species = 'mouse',
                 LR.database = 'omnipath',
                 CellToCell = T)

Error in (function (URL, FUN, ..., N.TRIES = 1L)  :
  omnipath_download() failed:
  URL: https://omnipathdb.org/interactions?genesymbols=yes&datasets=ligrecextra&organisms=10090&fields=sources,references,curation_effort&license=academic
  error: cannot open the connection to 'https://omnipathdb.org/interactions?genesymbols=yes&datasets=ligrecextra&organisms=10090&fields=sources,references,curation_effort&license=academic'

Calls: <Anonymous> ... <Anonymous> -> import_omnipath -> do.call -> <Anonymous>
Execution halted

I found a discussion of this error message on the OmnipathR GitHub page, where it was attributed to a possible temporary disruption of the OmniPath service. However, I have been trying to run this command for the past three days and the error has not gone away. Please let me know if you have any advice for me.

msraredon commented 2 years ago

Hmm, yes, this is a tricky one. As you say, I think this is an issue with the OmniPath resource call rather than with the NICHES function itself, so I don't think I can fix it on our end. A workaround would be to download the OmniPath database separately (you can see how I do this in 'LoadOmniPath.R'; the code is also copied below) and then format it as an input for the 'custom' ground truth option in NICHES. That will allow you to run NICHES without having to call OmniPath directly. I've never tried this, but I think it should work, and it would circumvent OmniPath server issues, or even internet connection issues if that is what is going on.

Let me know if this doesn't work. Hope this helps. S

NICHES::LoadOmniPath.R code copied here for reference:

LoadOmniPath <- function(species){

  # Setup species call
  if (species == 'human'){
    organism = 9606
  }else if (species == 'mouse'){
    organism = 10090
  }else if(species == 'rat'){
    organism = 10116
  }else{
    stop("\nPlease select species for OmniPath mapping. Allows 'human','mouse',or 'rat' ")
  }

  # Ligand-Receptor Network ----
  lr_Interactions_Omnipath <- OmnipathR::import_ligrecextra_interactions(organism = organism) %>%
    dplyr::select(source_genesymbol,target_genesymbol) %>%
    dplyr::distinct()

  # Tag with mechanism name
  lr_Interactions_Omnipath$mechanism <- paste(lr_Interactions_Omnipath$source_genesymbol,lr_Interactions_Omnipath$target_genesymbol,sep = '-')

  # Identify max number of ligand subunits and max number of receptor subunits
  # (based on "_" as a separator, used in current OmniPath iteration as of 2021-06-07)
  source_sub_max <- max(sapply(lr_Interactions_Omnipath$source_genesymbol,function(x) length(strsplit(x,split="_")[[1]])))
  target_sub_max <- max(sapply(lr_Interactions_Omnipath$target_genesymbol,function(x) length(strsplit(x,split="_")[[1]])))

  # Initialize column names based on how many subunits are in initial database
  source_col_names <- paste0("source",c(1:source_sub_max))
  target_col_names <- paste0("target",c(1:target_sub_max))

  # Split into individual columns
  temp <- tidyr::separate(data = lr_Interactions_Omnipath,
                          col = source_genesymbol, # Split Source genes
                          into = source_col_names, # Uses initialized column names
                          sep = '_',
                          remove = F)
  temp <- tidyr::separate(data = temp,
                          col = target_genesymbol, # Split Target genes
                          into = target_col_names, # Uses initialized column names
                          sep = '_',
                          remove = F)

  # Export subunit dataframe
  source.subunits <- as.matrix(temp[,source_col_names]) # allows duplicate rownames
  rownames(source.subunits) <- temp$source_genesymbol
  target.subunits <- as.matrix(temp[,target_col_names]) # allows duplicate rownames
  rownames(target.subunits) <- temp$target_genesymbol

  ground.truth <- list('source.subunits' = source.subunits,
                       'target.subunits' = target.subunits)
  return(ground.truth)
}
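Editor's note: to make the subunit-splitting step in LoadOmniPath concrete, here is a minimal base-R sketch that performs the same "_" split on a small hand-written table, so it runs without OmnipathR or a network connection. The gene pairs and the helper name `split_subunits` are illustrative assumptions, not part of NICHES. How such a table is passed to the 'custom' option is also not confirmed in this thread; check `?RunNICHES` for the expected argument and format.

```r
# Toy ligand-receptor table; the gene pairs are illustrative only.
# Multi-subunit complexes are written with "_" between subunits.
lr <- data.frame(
  source_genesymbol = c("Tgfb1", "Il12a_Il12b", "Wnt5a"),
  target_genesymbol = c("Tgfbr1_Tgfbr2", "Il12rb1_Il12rb2", "Fzd5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split "_"-separated genes into a character matrix
# with one column per subunit, padding shorter entries with NA.
split_subunits <- function(genes, prefix) {
  parts <- strsplit(genes, split = "_", fixed = TRUE)
  n_max <- max(lengths(parts))
  padded <- lapply(parts, function(p) {
    c(p, rep(NA_character_, n_max - length(p)))
  })
  m <- do.call(rbind, padded)
  colnames(m) <- paste0(prefix, seq_len(n_max))
  rownames(m) <- genes  # matrices allow duplicate rownames
  m
}

# Same list shape that LoadOmniPath returns
ground.truth <- list(
  source.subunits = split_subunits(lr$source_genesymbol, "source"),
  target.subunits = split_subunits(lr$target_genesymbol, "target")
)

print(ground.truth$source.subunits)
```

Single-gene ligands or receptors end up with NA in the unused subunit columns, which mirrors what `tidyr::separate` does in the function above when a row has fewer pieces than columns.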

bhavyaac commented 2 years ago

Hi, thank you for the quick reply! I will try your suggestion soon. For now I have loaded the fantom5 database instead by setting LR.database = 'fantom5', and this seems to work well.
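Editor's note: the fallback described here (try 'omnipath', use 'fantom5' if the download fails) can be automated with `tryCatch`. The sketch below is only an illustration: `try_in_order` is a hypothetical helper, and the two attempts are stand-in functions so the example runs without NICHES, Seurat, or a network connection.

```r
# Hypothetical helper: call each zero-argument function in turn and
# return the first result that does not raise an error.
# A NULL result is treated the same as a failure.
try_in_order <- function(attempts) {
  for (name in names(attempts)) {
    result <- tryCatch(
      attempts[[name]](),
      error = function(e) {
        message("'", name, "' failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(result)) {
      return(list(used = name, value = result))
    }
  }
  stop("all attempts failed")
}

# Stand-ins that mimic the situation in this thread
demo <- try_in_order(list(
  omnipath = function() stop("cannot open the connection"),
  fantom5  = function() "fantom5 result"
))
print(demo$used)

# With NICHES installed, the attempts could wrap the calls from this thread:
# try_in_order(list(
#   omnipath = function() RunNICHES(seurat, assay = 'RNA', species = 'mouse',
#                                   LR.database = 'omnipath', CellToCell = T),
#   fantom5  = function() RunNICHES(seurat, assay = 'RNA', species = 'mouse',
#                                   LR.database = 'fantom5', CellToCell = T)
# ))
```

Because the two databases contain different ligand-receptor pairs, it is worth recording which one was actually used (the `used` field here) alongside the results.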