Open biocyberman opened 6 years ago
Thanks for the suggestion. I've added the functions searchDatasets(), searchAttributes(), and searchFilters(). These take a mart argument and a pattern, which is a regex string that matches against all columns returned by the appropriate listX function, e.g.
ensemblMart <- useEnsembl("ensembl")
searchDatasets(pattern = "norvegicus", mart = ensemblMart)
   dataset                  description          version
87 rnorvegicus_gene_ensembl Rat genes (Rnor_6.0) Rnor_6.0
ensemblMart <- useDataset(dataset = "rnorvegicus_gene_ensembl",
                          mart = ensemblMart)
searchFilters(mart = ensemblMart, pattern = "ensembl.*id$")
   name                  description
51 ensembl_gene_id       Gene stable ID(s) [e.g. ENSRNOG00000000001]
53 ensembl_transcript_id Transcript stable ID(s) [e.g. ENSRNOT00000000008]
55 ensembl_peptide_id    Protein stable ID(s) [e.g. ENSRNOP00000000008]
57 ensembl_exon_id       Exon ID(s) [e.g. ENSRNOE00000000009]
Let me know if that fits what you're looking for.
Fantastic! I will try and see.
@grimbough The functions generally work much better than their list* counterparts. It would be even better if you could also implement a what argument to choose which columns to search. They currently search through all columns; with what we would have finer-grained control over the search.
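To make the request concrete, here is a minimal base-R sketch of how a what argument could restrict the search to chosen columns. The helper searchTable() and the toy filters table are hypothetical and only for illustration; they are not part of biomaRt, and the real search* functions may be implemented differently.

```r
# Hypothetical helper, NOT part of biomaRt: search a data frame shaped like
# the output of a list* function, optionally restricted to some columns.
searchTable <- function(tbl, pattern, what = names(tbl)) {
  stopifnot(all(what %in% names(tbl)))
  # One logical vector per searched column: TRUE where the regex matches.
  hits <- lapply(tbl[what], function(col) grepl(pattern, as.character(col)))
  # Keep a row if any of the searched columns matched.
  keep <- Reduce(`|`, hits)
  tbl[keep, , drop = FALSE]
}

# Toy table for illustration only.
filters <- data.frame(
  name = c("ensembl_gene_id", "ensembl_transcript_id", "chromosome_name"),
  description = c("Gene stable ID(s) [e.g. ENSRNOG00000000001]",
                  "Transcript stable ID(s) [e.g. ENSRNOT00000000008]",
                  "Chromosome/scaffold name"),
  stringsAsFactors = FALSE
)

# Searching every column finds "stable" in two descriptions ...
searchTable(filters, "stable")
# ... while restricting the search to the name column finds nothing.
searchTable(filters, "stable", what = "name")
```

With the default, the helper behaves like the current all-columns search, so existing calls would not need to change.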
In many cases, I "list" something just to find the correct name of a dataset or attribute I want to use. The list* methods are useful for exploring what is there, but their output is inconvenient for finding a particular name: it is lengthy, and I have to use grep, often with unsatisfactory results.
Here is my current workflow:
So I want to request find* methods with wildcard, regex and fuzzy-match support:
This will simplify the workflow and save time.
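As a rough illustration of the three matching modes requested above, the following base-R sketch uses grepl() for regex, utils::glob2rx() to translate wildcards, and agrepl() for approximate matching. The function findNames(), its mode argument, and the max.distance value are assumptions made for this example; none of them is an existing biomaRt interface.

```r
# Hypothetical helper, NOT part of biomaRt: match a query against a
# character vector of names using one of three matching modes.
findNames <- function(x, query, mode = c("regex", "wildcard", "fuzzy")) {
  mode <- match.arg(mode)
  hit <- switch(mode,
    # Plain regular expression.
    regex    = grepl(query, x),
    # Shell-style wildcards (* and ?) translated to a regex first.
    wildcard = grepl(utils::glob2rx(query), x),
    # Approximate matching; 0.2 allows edits up to 20% of the query length
    # (an arbitrary choice for this sketch).
    fuzzy    = agrepl(query, x, max.distance = 0.2)
  )
  x[hit]
}

# Toy vector of dataset names for illustration only.
datasets <- c("rnorvegicus_gene_ensembl",
              "hsapiens_gene_ensembl",
              "mmusculus_gene_ensembl")

findNames(datasets, "^rnor", mode = "regex")
findNames(datasets, "*sapiens*", mode = "wildcard")
findNames(datasets, "norvegicos", mode = "fuzzy")  # misspelled on purpose
```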