ucscGenomeBrowser / kent

UCSC Genome Browser source tree. Stable branch: "beta".
http://genome.ucsc.edu/

Feature Request: REST API Support for selecting track Data using name identifiers #47

Closed sanchit-saini closed 3 years ago

sanchit-saini commented 3 years ago

Hello, is it possible to support custom selection of track data using a name identifier filter? Something like /getData/track?genome=hg38;track=gold;names=AC008953.7,AL671879.2,KF458873.1
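
For illustration, here is a minimal Python sketch of how a client might build such a request. The names parameter is the feature being requested and is not part of the existing REST API. API_BASE is assumed to be the public API host, and build_track_url is a name made up for this example.

```python
# Hypothetical: "names" is the parameter proposed in this request;
# the existing UCSC REST API does not provide it.
API_BASE = "https://api.genome.ucsc.edu"  # assumed public API host


def build_track_url(genome, track, names=None):
    """Build a /getData/track URL, optionally with the proposed names filter."""
    params = [("genome", genome), ("track", track)]
    if names:
        # Identifiers are sent as one comma-separated value.
        params.append(("names", ",".join(names)))
    # Parameters are joined with ';' as in the example request above.
    query = ";".join(f"{key}={value}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


url = build_track_url("hg38", "gold", ["AC008953.7", "AL671879.2", "KF458873.1"])
print(url)
```

Leaving names out yields the plain genome and track request, so existing callers would be unaffected.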

The reason for this request is that rtracklayer provides an interface for querying the UCSC Table Browser, and it is in the process of migrating to the UCSC REST API.

Currently, to filter by name identifiers, the whole track's data is transferred and searched on the client side, which is very costly on the network. I think it would be far better if the name filter were applied on the server side.

genome-www commented 3 years ago

Thanks! Good point. We will need special code to make this work for the different track types. Which tracks do you think are a priority for this feature?

maximilianh commented 3 years ago

I imagine that the main gene tracks (Gencode for hg38 and knownGenes for hg19), or really any gene track that uses the genePred format, would be the first candidates for this feature.

Can you tell us how rtracklayer is currently using the table browser to do filtering - does it use the table browser filter feature?

On Mon, Dec 7, 2020 at 1:06 PM genome-www wrote:

> Thanks! Good point. We will need special code to make this work for the different track types. Which tracks do you think are a priority for this feature?


sanchit-saini commented 3 years ago

> Thanks! Good point. We will need special code to make this work for the different track types. Which tracks do you think are a priority for this feature?

As suggested by @maximilianh, I think gene tracks should be the first candidate for this feature.

sanchit-saini commented 3 years ago

> I imagine that the main gene tracks (Gencode for hg38 and knownGenes for hg19), actually, any gene track that is using the genePred format, would be the first candidate for this feature. Can you tell us how rtracklayer is currently using the table browser to do filtering - does it use the table browser filter feature?

Yes, rtracklayer does use the name identifier filter feature. For now, it makes a request to /getData/track with a genome and track to retrieve the specified track data. The result is stored in a data frame, and finally the data frame is subset on the name column using the user-specified identifiers.

https://github.com/sanchit-saini/rtracklayer/blob/3aff923df3c16f8a0b19e39ff5a3fc44a813d40a/R/ucsc.R#L545
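
The client-side workflow described above can be sketched roughly as follows. This is a Python illustration, not the rtracklayer code (which is R). The rows are made up, and the field name "name" and the helper subset_by_name are assumptions for this example. In practice the rows would come from a full /getData/track download, which is the costly part.

```python
def subset_by_name(rows, names, field="name"):
    """Keep only rows whose `field` value is one of `names` (client-side filter)."""
    wanted = set(names)  # set membership keeps each lookup cheap
    return [row for row in rows if row.get(field) in wanted]


# Made-up rows standing in for a complete track download.
rows = [
    {"chrom": "chr1", "name": "AC008953.7"},
    {"chrom": "chr2", "name": "AL671879.2"},
    {"chrom": "chr3", "name": "XX000000.1"},
]

kept = subset_by_name(rows, ["AC008953.7", "AL671879.2"])
print([row["name"] for row in kept])
```

Every row has to cross the network before the filter runs, even when only a handful of identifiers are wanted.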

maximilianh commented 3 years ago

I assume that you would like an additional argument for /getData/track that retrieves only transcripts with a given transcriptId (that is, a given "name" field)?

sanchit-saini commented 3 years ago

Assuming transcriptId is equivalent to the name identifier, an additional argument would be required (e.g. names). The Table Browser describes its identifiers option as follows:

"identifiers (selected tracks only): Restricts the output to table data that match a list of identifiers, for instance RefSeq accessions for the RefSeq track. If no identifiers are entered, all table data within the specified region will be displayed." Source: https://genome.ucsc.edu/cgi-bin/hgTables
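
A server-side filter with the same semantics as the quoted description might look roughly like this Python sketch. It is only an illustration under assumptions: the names parameter, the "name" field, and both helper functions are hypothetical and say nothing about how the UCSC code is or would be written.

```python
def parse_identifiers(query):
    """Read the hypothetical `names` parameter from a ';'-separated query string."""
    params = dict(pair.split("=", 1) for pair in query.split(";") if "=" in pair)
    raw = params.get("names", "")
    # Drop empty entries so "names=" counts as "no identifiers entered".
    return [name for name in raw.split(",") if name]


def apply_identifier_filter(rows, identifiers, field="name"):
    """Return rows matching the identifiers, or all rows if none were given."""
    if not identifiers:
        return list(rows)
    wanted = set(identifiers)
    return [row for row in rows if row.get(field) in wanted]


rows = [{"name": "AC008953.7"}, {"name": "AL671879.2"}, {"name": "KF458873.1"}]
ids = parse_identifiers("genome=hg38;track=gold;names=AC008953.7,KF458873.1")
print(apply_identifier_filter(rows, ids))
```

An empty identifier list returns everything, which mirrors the Table Browser behaviour of displaying all table data when no identifiers are entered.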

braneyboo commented 3 years ago

We added this feature request to our ticket system. Thanks for asking! We're MUCH more likely to do things that users have requested.