ropensci / UCSCXenaTools

:package: An R package for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq https://cran.r-project.org/web/packages/UCSCXenaTools/
https://docs.ropensci.org/UCSCXenaTools
GNU General Public License v3.0

API function for querying single gene or sample does not work #5

Closed ShixiangWang closed 5 years ago

ShixiangWang commented 5 years ago

Take .p_dataset_probe_values and .p_dataset_gene_probe_avg as examples.

library(UCSCXenaTools)
hub = "https://pancanatlas.xenahubs.net"
dataset = "EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena"
samples = c("TCGA-02-0047-01","TCGA-02-0055-01")
probes = c("TP53", "RB1")

Both functions work with multiple samples and multiple probes:

> .p_dataset_probe_values(hub, dataset, samples, probes)
[[1]]
  strand chromend chromstart chrom
1      + 49056122   48877911 chr13
2      -  7590868    7565097 chr17

[[2]]
      [,1]  [,2]
[1,] 10.84  9.96
[2,] 11.22 10.15

> .p_dataset_gene_probe_avg(hub, dataset, samples, probes) 
  gene                     position       scores
1 TP53   -, 7590868, 7565097, chr17  10.84, 9.96
2  RB1 +, 49056122, 48877911, chr13 11.22, 10.15

They do not work for a single sample (every value comes back as NaN):

> .p_dataset_probe_values(hub, dataset, "TCGA-02-0055-01", probes)
[[1]]
  strand chromend chromstart chrom
1      + 49056122   48877911 chr13
2      -  7590868    7565097 chr17

[[2]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,]  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   NaN   NaN   NaN   NaN   NaN   NaN
[2,]  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   NaN   NaN   NaN   NaN   NaN   NaN

> .p_dataset_gene_probe_avg(hub, dataset, "TCGA-02-0055-01", probes)
  gene                     position                                                                    scores
1 TP53   -, 7590868, 7565097, chr17 NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN
2  RB1 +, 49056122, 48877911, chr13 NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN

They also fail for a single probe (for example, a single gene):

> .p_dataset_probe_values(hub, dataset, samples, "TP53")
 Error in UCSCXenaTools:::.xena_post(host, UCSCXenaTools:::.call(xquery,  : 
  Internal Server Error (HTTP 500). 
> .p_dataset_gene_probe_avg(hub, dataset, samples, "TP53") 
 Error in UCSCXenaTools:::.xena_post(host, UCSCXenaTools:::.call(xquery,  : 
  Internal Server Error (HTTP 500). 

Interestingly, .p_dataset_gene_probes_values works for a single gene, but not for a single sample:

> .p_dataset_gene_probes_values(hub, dataset, samples, "TP53")
[[1]]
[[1]]$position
  strand chromend chromstart chrom
1      -  7590868    7565097 chr17

[[1]]$name
[1] "TP53"

[[2]]
      [,1] [,2]
[1,] 10.84 9.96

> .p_dataset_gene_probes_values(hub, dataset, "TCGA-02-0047-01", "TP53")
[[1]]
[[1]]$position
  strand chromend chromstart chrom
1      -  7590868    7565097 chr17

[[1]]$name
[1] "TP53"

[[2]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,]  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   NaN   NaN   NaN   NaN   NaN   NaN
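
A possible explanation for the NaN output above: the result has 15 columns, and the sample ID "TCGA-02-0055-01" has 15 characters. This suggests, though the thread does not confirm it, that the bare string is being treated as a sequence of 15 one-character sample names, none of which exists in the dataset. A minimal sketch of that arithmetic in base R (the variable names are only for illustration):

```r
# The single sample ID used in the failing calls above
sample_id <- "TCGA-02-0055-01"

# Number of characters in the ID
n_chars <- nchar(sample_id)

# What the ID looks like if it is split into one-character "samples"
chars <- strsplit(sample_id, "")[[1]]

print(n_chars)
print(chars)
```

If this reading is right, wrapping the ID so that it is sent as a one-element array should give a single column instead of 15.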

ShixiangWang commented 5 years ago

The xenaPython package has the same problem, but it works when the probes are passed as a list:

In [6]: xena.dataset_probe_values (hub, dataset, samples, ["TP53"])                                                                     
Out[6]: 
[[{'strand': '-',
   'chromend': 7590868,
   'chromstart': 7565097,
   'chrom': 'chr17'}],
 [[10.84, 9.96, 11.91, 11.39]]]

In [5]: xena.dataset_probe_values (hub, dataset, samples, "TP53")                                                                       
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-5-9a5309f9f398> in <module>
----> 1 xena.dataset_probe_values (hub, dataset, samples, "TP53")

~/anaconda3/lib/python3.6/site-packages/xenaPython/__init__.py in <lambda>(host, dataset, samples, probes)

~/anaconda3/lib/python3.6/site-packages/xenaPython/xenaQuery.py in post(url, query)
    199     """POST a xena data query to the given url."""
    200     req = Request(url + '/data/', query.encode(), headers)
--> 201     response = urlopen(req)
    202     result = response.read().decode('utf-8')
    203     return result

~/anaconda3/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

~/anaconda3/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
    530         for processor in self.process_response.get(protocol, []):
    531             meth = getattr(processor, meth_name)
--> 532             response = meth(req, response)
    533 
    534         return response

~/anaconda3/lib/python3.6/urllib/request.py in http_response(self, request, response)
    640         if not (200 <= code < 300):
    641             response = self.parent.error(
--> 642                 'http', request, response, code, msg, hdrs)
    643 
    644         return response

~/anaconda3/lib/python3.6/urllib/request.py in error(self, proto, *args)
    568         if http_err:
    569             args = (dict, 'default', 'http_error_default') + orig_args
--> 570             return self._call_chain(*args)
    571 
    572 # XXX probably also want an abstract factory that knows when it makes

~/anaconda3/lib/python3.6/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

~/anaconda3/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648 class HTTPDefaultErrorHandler(BaseHandler):
    649     def http_error_default(self, req, fp, code, msg, hdrs):
--> 650         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 
    652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 500: Server Error

ShixiangWang commented 5 years ago

The problem comes from UCSCXenaTools:::.call, which builds the query string sent to the hub.

In the generated query, a single probe is written as the scalar \"TP53\"; it should be written as the array [\"TP53\"].
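
The scalar-versus-array difference can be reproduced outside the package. The sketch below assumes the parameters are serialized with something like jsonlite::toJSON(..., auto_unbox = TRUE). That is an assumption about the internals of .call, not something shown in this thread. With auto_unbox = TRUE, a length-one character vector becomes a JSON string, while a length-one list stays a JSON array.

```r
library(jsonlite)

# Assumption: the query builder serializes parameters roughly like this.
# A length-one character vector is unboxed to a bare JSON string.
scalar_json <- toJSON("TP53", auto_unbox = TRUE)

# A length-one list keeps its array brackets.
list_json <- toJSON(list("TP53"), auto_unbox = TRUE)

# A vector of length two or more is always an array.
multi_json <- toJSON(c("TP53", "RB1"), auto_unbox = TRUE)

print(scalar_json)
print(list_json)
print(multi_json)
```

This would explain why the calls with two probes or two samples succeed while the single-value calls fail.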

ShixiangWang commented 5 years ago

This can be fixed by passing a list instead of a length-one character vector:

> .p_dataset_probe_values(hub, dataset, samples, "TP53")
 Error in UCSCXenaTools:::.xena_post(host, UCSCXenaTools:::.call(xquery,  : 
  Internal Server Error (HTTP 500). 
> .p_dataset_probe_values(hub, dataset, samples, list("TP53"))
[[1]]
  strand chromend chromstart chrom
1      -  7590868    7565097 chr17

[[2]]
      [,1] [,2]
[1,] 10.84 9.96

> .p_dataset_gene_probe_avg(hub, dataset, samples, as.list("TP53"))
  gene                   position      scores
1 TP53 -, 7590868, 7565097, chr17 10.84, 9.96
> .p_dataset_gene_probes_values(hub, dataset, list("TCGA-02-0047-01"), list("TP53"))
[[1]]
[[1]]$position
  strand chromend chromstart chrom
1      -  7590868    7565097 chr17

[[1]]$name
[1] "TP53"

[[2]]
      [,1]
[1,] 10.84
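
To avoid remembering the list() wrapper at every call site, the coercion could be done once in a small helper. The following is only a sketch: ensure_list is a hypothetical name and is not part of UCSCXenaTools.

```r
# Hypothetical helper (not part of UCSCXenaTools): make sure samples or
# probes are always passed as a list, even when there is only one value.
ensure_list <- function(x) {
  if (is.list(x)) {
    return(x)
  }
  as.list(x)
}

one_probe <- ensure_list("TP53")            # list("TP53")
two_probes <- ensure_list(c("TP53", "RB1")) # list("TP53", "RB1")
already_list <- ensure_list(list("TP53"))   # returned unchanged

# Intended usage with the internal query functions shown above
# (not run here, since it needs network access):
# .p_dataset_probe_values(hub, dataset, ensure_list(samples), ensure_list("TP53"))
```

Wrapping both the samples and the probes arguments covers the single-sample and single-probe cases with the same code path.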