Proteomicslab57357 / UniprotR

Retrieving Information of Proteins from Uniprot
GNU General Public License v3.0

ConvertID gives error "Internet connection problem occurs and the function will return the original error" #22

tangwei1129 closed this issue 1 year ago

tangwei1129 commented 2 years ago

Thank you for the R package for extracting information from UniProt. I was using the ConvertID function to map UniProt accessions to IDs in other databases such as Ensembl, but it gave the error below. Could you help fix it?

ConvertID("P04406", ID_from = "ACC+ID", ID_to = "ENSEMBL_ID")
Internet connection problem occurs and the function will return the original error
cannot open the connection
  From_UniProtKB_AC_ID To ENSEMBL_ID
1 P04406
Warning message:
In file(file, "rt") :
  cannot open URL 'https://www.uniprot.org/uploadlists/?query=P04406&format=tab&from=ACC+ID&to=ENSEMBL_ID': HTTP status was '404 Not Found'

AliYoussef96 commented 2 years ago

Hi @tangwei1129

Thank you for using our package. I believe this is caused by the recent UniProt API update: the old uploadlists URL that ConvertID calls now returns 404. We are working on a fix in the package. In the meantime, you can use the ID mapping code provided by UniProt (https://www.uniprot.org/help/id_mapping#submitting-an-id-mapping-job), which I have edited below so you can run it directly.

# Replacement for the failing call:
# ConvertID("P04406", ID_from = "ACC+ID", ID_to = "ENSEMBL_ID")

library(httr)

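# One or more UniProt accessions, e.g. c("P04406", "P04407")
ids <- c("P04406")
# The API expects a single comma-separated string
ids <- paste0(ids, collapse = ",")

# Database names used by the new API (in place of "ACC+ID" and "ENSEMBL_ID")
from.id <- "UniProtKB_AC-ID"
to.id <- "Ensembl"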

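# Turn the redirect URL of a finished job into its streaming equivalent
getResultsURL <- function(redirectURL) {
  if (grepl("/idmapping/results/", redirectURL, fixed = TRUE)) {
    url <- gsub("/idmapping/results/", "/idmapping/stream/", redirectURL, fixed = TRUE)
  } else {
    url <- gsub("/results/", "/results/stream/", redirectURL, fixed = TRUE)
  }
  # Return the URL explicitly instead of relying on the invisible value of the assignment
  return(url)
}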

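# Poll the job status until it finishes, fails, or the tries run out
isJobReady <- function(jobId) {
  pollingInterval <- 5  # seconds between status checks
  nTries <- 20          # give up after 20 checks
  for (i in seq_len(nTries)) {
    url <- paste0("https://rest.uniprot.org/idmapping/status/", jobId)
    r <- GET(url = url, accept_json())
    status <- content(r, as = "parsed")
    # A finished job reports results and/or failed IDs
    if (!is.null(status[["results"]]) || !is.null(status[["failedIds"]])) {
      return(TRUE)
    }
    # Messages mean the job could not be processed
    if (!is.null(status[["messages"]])) {
      print(status[["messages"]])
      return(FALSE)
    }
    Sys.sleep(pollingInterval)
  }
  return(FALSE)
}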

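# Form fields for the ID mapping job
files <- list(
  ids = ids,
  from = from.id,
  to = to.id
)
# Submit the job; the response contains the job ID
r <- POST(url = "https://rest.uniprot.org/idmapping/run", body = files, encode = "multipart", accept_json())
stop_for_status(r)  # stop early if the submission itself failed
submission <- content(r, as = "parsed")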

if (isJobReady(submission[["jobId"]])) {
  # Ask where the results of the finished job can be found
  url <- paste0("https://rest.uniprot.org/idmapping/details/", submission[["jobId"]])
  r <- GET(url = url, accept_json())
  details <- content(r, as = "parsed")
  url <- getResultsURL(details[["redirectURL"]])
  # Using TSV format see: https://www.uniprot.org/help/api_queries#what-formats-are-available
  url <- paste0(url, "?format=tsv")
  r <- GET(url = url, accept_json())
  # Read the response body as text and parse it as a tab-separated table
  resultsTable <- read.table(text = content(r, as = "text", encoding = "UTF-8"), sep = "\t", header = TRUE)
  print(resultsTable)
} else {
  message("The ID mapping job did not finish; no results to show.")
}
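
If it helps, the two steps of the snippet above that do not need the network (rewriting the redirect URL to its stream form, and parsing the TSV body) can be checked offline. This is only a minimal sketch. It assumes the redirect URL looks like https://rest.uniprot.org/idmapping/results/<jobId> and that the TSV has "From" and "To" columns. The helper names stream_url() and parse_mapping(), the job ID, and the target ID are hypothetical placeholders, not real UniProt output.

```r
# Hypothetical helper: convert a results URL to its streaming form and
# append the requested format (assumes the URL has no query string yet).
stream_url <- function(redirect_url, format = "tsv") {
  if (grepl("/idmapping/results/", redirect_url, fixed = TRUE)) {
    url <- sub("/idmapping/results/", "/idmapping/stream/", redirect_url, fixed = TRUE)
  } else {
    url <- sub("/results/", "/results/stream/", redirect_url, fixed = TRUE)
  }
  paste0(url, "?format=", format)
}

# Hypothetical helper: parse the tab-separated response text into a data frame.
parse_mapping <- function(tsv_text) {
  read.table(text = tsv_text, sep = "\t", header = TRUE,
             stringsAsFactors = FALSE, quote = "")
}

# Placeholder inputs for illustration only (not real job IDs or mappings).
example_url <- stream_url("https://rest.uniprot.org/idmapping/results/JOB_ID_PLACEHOLDER")
example_tbl <- parse_mapping("From\tTo\nP04406\tTARGET_ID_PLACEHOLDER\n")

print(example_url)
print(example_tbl)
```

Keeping these two steps in small functions makes it easier to see whether a failure comes from the request itself or from the handling of the response.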