ropensci / UCSCXenaTools

:package: An R package for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq https://cran.r-project.org/web/packages/UCSCXenaTools/
https://docs.ropensci.org/UCSCXenaTools
GNU General Public License v3.0

API functions exposed by xenaPython #6

Closed ShixiangWang closed 5 years ago

ShixiangWang commented 5 years ago
from . import xenaQuery as xena

def Gene_values(hub, dataset, samples, gene):
    values = xena.dataset_gene_values(hub, dataset, samples, [gene])
    return values[0]["scores"][0]

def Genes_values(hub, dataset, samples, genes):
    values = [x["scores"][0] for x in xena.dataset_gene_values(hub, dataset, samples, genes)]
    return values

def Probe_values(hub, dataset, samples, probe):
    values = xena.dataset_probe_values(hub, dataset, samples, [probe])
    return values[0]

def Probes_values(hub, dataset, samples, probes):
    values = xena.dataset_probe_values(hub, dataset, samples, probes)
    return values

def dataset_samples(hub, dataset):
    return xena.dataset_samples(hub, dataset)

def dataset_fields(hub, dataset):
    return xena.dataset_field(hub, dataset)

def all_cohorts(hub):
    return xena.all_cohorts(hub)
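
To make the indexing in `Gene_values` and `Genes_values` easier to follow, here is a rough sketch that runs without contacting a hub. It swaps in a fake function for `xena.dataset_gene_values`; the response shape (one dict per gene, whose `scores` entry is a list with the per-sample row first) is only inferred from the indexing above, and the `gene` key and the numbers are invented for illustration.

```python
# Hypothetical stand-in for xena.dataset_gene_values (the real call queries a Xena hub).
# Assumed shape: one dict per gene, with "scores" holding a list whose first
# element is the row of per-sample values. The "gene" key is made up.
def fake_dataset_gene_values(hub, dataset, samples, genes):
    return [
        {"gene": g, "scores": [[float(i + j) for j in range(len(samples))]]}
        for i, g in enumerate(genes)
    ]

def gene_values(hub, dataset, samples, gene):
    # Ask for a single gene and unwrap its first (only) scores row.
    values = fake_dataset_gene_values(hub, dataset, samples, [gene])
    return values[0]["scores"][0]

def genes_values(hub, dataset, samples, genes):
    # One scores row per requested gene, in request order.
    return [x["scores"][0] for x in fake_dataset_gene_values(hub, dataset, samples, genes)]

samples = ["S1", "S2", "S3"]
one = gene_values("hub", "dataset", samples, "TP53")
many = genes_values("hub", "dataset", samples, ["TP53", "RB1"])
print(one)
print(many)
```

Under these assumptions the single-gene wrapper returns a flat list with one value per sample, while the multi-gene wrapper returns a list of such lists.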
ShixiangWang commented 5 years ago

Examples:

#### Usage
    import xenaPython as xena

#### Examples

##### 1: Query expression of three identifiers in four samples
    import xenaPython as xena

    hub = "https://toil.xenahubs.net"
    dataset = "tcga_RSEM_gene_tpm"
    samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"]
    probes = ['ENSG00000282740.1', 'ENSG00000000005.5', 'ENSG00000000419.12']
    [position, [ENSG00000282740_1, ENSG00000000005_5, ENSG00000000419_12]] = xena.dataset_probe_values(hub, dataset, samples, probes)
    ENSG00000282740_1
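
The destructuring above suggests the result is a position entry followed by one list of values per identifier. A minimal sketch of labelling those rows by sample, assuming each row is ordered like `samples` and rows are ordered like `probes`; `label_rows` is a hypothetical helper (not part of xenaPython) and the numbers are made up:

```python
def label_rows(samples, probes, rows):
    """Map each probe to a {sample: value} dict.

    Assumes rows follow the order of `probes`, and values within a row
    follow the order of `samples`.
    """
    if len(probes) != len(rows):
        raise ValueError("expected one row per probe")
    return {probe: dict(zip(samples, row)) for probe, row in zip(probes, rows)}

samples = ["TCGA-02-0047-01", "TCGA-02-0055-01"]
probes = ["ENSG00000282740.1", "ENSG00000000005.5"]

# Made-up result in the [position, [row, ...]] shape used in example 1.
result = [None, [[0.1, 0.2], [1.5, 2.5]]]
position, rows = result

labelled = label_rows(samples, probes, rows)
print(labelled["ENSG00000000005.5"]["TCGA-02-0055-01"])
```

Keying by identifier and sample avoids having to unpack into one variable per probe, which gets unwieldy beyond a handful of identifiers.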

##### 2: Query expression of three genes in four samples, when the dataset has an identifier-to-gene mapping (i.e. a Xena probeMap)
    hub = "https://toil.xenahubs.net"
    dataset = "tcga_RSEM_gene_tpm"
    samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"]
    genes =["TP53", "RB1", "PIK3CA"]
    xena.dataset_gene_probe_avg(hub, dataset, samples, genes)

##### 3: If the dataset has no id-to-gene mapping but uses gene names as its identifiers, query gene expression as in example 1; example 2 will not work.
    hub = "https://toil.xenahubs.net"
    dataset = "tcga_RSEM_Hugo_norm_count"
    samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"]
    probes =["TP53", "RB1", "PIK3CA"]
    [position, [TP53, RB1, PIK3CA]] = xena.dataset_probe_values (hub, dataset, samples, probes)
    TP53

##### 4: Find out the samples in a dataset
    hub = "https://tcga.xenahubs.net"
    dataset = "TCGA.BLCA.sampleMap/HiSeqV2"
    xena.dataset_samples (hub, dataset, 10)    # limit the result to 10 samples
    xena.dataset_samples (hub, dataset, None)  # None: no limit, return all samples

##### 5: Find out the identifiers in a dataset
    hub = "https://tcga.xenahubs.net"
    dataset = "TCGA.BLCA.sampleMap/HiSeqV2"
    xena.dataset_field (hub, dataset)

##### 6: Find out the number of identifiers in a dataset
    hub = "https://tcga.xenahubs.net"
    dataset = "TCGA.BLCA.sampleMap/HiSeqV2"
    xena.dataset_field_n (hub, dataset)

##### 7: Find out the hub id and dataset id
Use the Xena Browser datasets tool: https://xenabrowser.net/datapages/
ShixiangWang commented 5 years ago

The ProbeMap column of XenaData can be used to tell whether a dataset has a probeMap, and therefore whether it can be queried by gene symbol.
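
One way to picture that check, as a sketch only: the record layout below (a plain dict whose `ProbeMap` entry is missing, `None`, or empty when the dataset has no mapping) and the helper `pick_query` are assumptions for illustration, and the probeMap path is a placeholder. The two names it returns are the xenaPython calls used in examples 1 to 3 above.

```python
def pick_query(record):
    """Return the name of the query to use for one dataset metadata record.

    `record` is a hypothetical dict standing in for one row of XenaData;
    a missing, None, or empty "ProbeMap" is treated as "no probeMap".
    """
    probemap = record.get("ProbeMap")
    if probemap:
        # A mapping exists, so gene symbols can be used (as in example 2).
        return "dataset_gene_probe_avg"
    # No mapping: query by the dataset's own identifiers (as in examples 1 and 3).
    return "dataset_probe_values"

# Placeholder records; the ProbeMap value is not a real path.
with_map = {"dataset": "tcga_RSEM_gene_tpm", "ProbeMap": "placeholder/probeMap"}
without_map = {"dataset": "tcga_RSEM_Hugo_norm_count", "ProbeMap": None}

print(pick_query(with_map))
print(pick_query(without_map))
```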

ShixiangWang commented 5 years ago

These use cases are covered in the API documentation.

CSUXu commented 5 years ago

Could you share an email address or WeChat ID, so it is easier to ask you questions?

ShixiangWang commented 5 years ago

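@CSUXu My email is on my profile page. That said, I suggest discussing things in GitHub issues: it keeps the discussion focused and can also help other people.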

ShixiangWang commented 5 years ago

These functions have all been implemented in v1.2.2, so I am closing this issue.