warelab / gramoogle

Gramene Search Interface
MIT License

Filter Results 1 #7

Closed mycrobe closed 8 years ago

mycrobe commented 9 years ago

As a user, I wish to be able to filter the result set of genes by specific metadata, e.g. species, gene tree, or InterPro domain.

By default, the filter UI is in a simple, small configuration that does not take up too much screen real estate. It displays the number of results and genomes in a left-justified sentence, with a "Filter" button on the right-hand side. When this button is pressed, the filter UI slides down to allow the result set to be filtered.

[Image: filter UI]

Implement filter interface to restrict results to a specific:-

Each of these filters is arranged as a column in the filter UI (see image). Only the first three (species, tree, domain) are visible; the others can be reached by scrolling to the right.

In this initial version, do not implement the text filter boxes from the photo. For query simplicity, use radio buttons instead of checkboxes so that only one value per filter (e.g. one species) can be selected.

ajo2995 commented 9 years ago

The species filter can alternatively be implemented in the results component that displays the species tree and genomic/gene space distribution. The filter is defined based on the species that are displayed. As the user hides/expands subtrees the filter is modified and the results are updated at once.

ajo2995 commented 9 years ago

The set of filters in each category should be based on their frequency in the result set.
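A minimal sketch of that ranking, assuming the per-category counts are already available as a plain mapping of value to count. The function name `top_filter_options`, the `limit` parameter, and the sample counts are hypothetical, not taken from gramoogle:

```python
def top_filter_options(counts, limit=10):
    """Return up to `limit` (value, count) pairs, most frequent first.

    `counts` maps a filter value to the number of genes in the current
    result set that carry it. Ties are broken alphabetically so the
    column order stays stable between renders.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Values that do not occur in the result set are never offered.
    return [(value, count) for value, count in ranked if count > 0][:limit]


# Made-up counts, for illustration only.
example = {"protein_coding": 120, "pseudogene": 7, "miRNA": 7, "tRNA": 0}
options = top_filter_options(example, limit=3)
```

With the made-up counts above, `options` holds `protein_coding` first, then `miRNA` and `pseudogene` in alphabetical order because they tie.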

ajo2995 commented 9 years ago

The user needs to see which filters have been applied. The user also needs to be able to turn off filters (one at a time).
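One way to model this, together with the radio-button behaviour from the issue description, is a small store that holds at most one value per category. This is only a sketch: the class `ActiveFilters` and its methods are hypothetical, the categories are assumed to be the Solr field names from the search index, and translating each selection into a Solr `fq` (filter query) parameter is an assumption about how the filters would be sent.

```python
class ActiveFilters:
    """Holds at most one selected value per filter category."""

    def __init__(self):
        self._selected = {}  # category (assumed Solr field name) -> value

    def select(self, category, value):
        # Radio-button semantics: a new choice replaces the old one.
        self._selected[category] = value

    def clear(self, category):
        # Turn off a single filter; the others stay applied.
        self._selected.pop(category, None)

    def applied(self):
        # What the UI would list as "currently applied" filters.
        return list(self._selected.items())

    def to_fq(self):
        # One fq clause per filter; values are quoted as phrases and
        # are not escaped any further in this sketch.
        return ['%s:"%s"' % (field, value)
                for field, value in self._selected.items()]


filters = ActiveFilters()
filters.select("NCBITaxon_ancestors", "3702")
filters.select("biotype", "protein_coding")
filters.select("NCBITaxon_ancestors", "4530")  # replaces 3702
filters.clear("biotype")
```

After these calls only the second species choice remains applied, so `filters.to_fq()` yields a single clause on `NCBITaxon_ancestors`.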

ajo2995 commented 9 years ago

Other possible filters: biotype; location (implement on genomic distribution?)

ajo2995 commented 9 years ago

The data which populate the filters can come from a query like this:

http://data.gramene.org/search/genes?q=*&rows=0&facet=true&facet.threads=-1&facet.field=bin_10Mb&facet.field=GO_ancestors&facet.field=interpro_ancestors&facet.field=PO_ancestors&facet.field=NCBITaxon_ancestors&facet.field=biotype&facet.mincount=1&facet.limit=-1

This is the "send lots of data" approach (note that facet.limit=-1 means "give me everything"). The genome bins field (bin_10Mb) needs to be sent this way for the visualization, and the others might be useful to have in local stores so that, when searching for a domain to filter by, only relevant domains appear. I didn't include the gene tree field because you would probably want to limit the number displayed.

The only problem is the size of the response: about half a megabyte for a non-specific search returning all genes. The response time can be improved by caching. Solr doesn't necessarily realize that we are a read-only database...
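For reference, the query can be assembled from a list of facet fields, and the counts can be unpacked into per-field lookups for the local stores. This is a rough sketch: the helper names are hypothetical, and the parser assumes the endpoint returns Solr's default JSON layout, where each entry under `facet_counts.facet_fields` is a flat list of alternating values and counts. The actual Gramene response may be shaped differently.

```python
from urllib.parse import urlencode

BASE_URL = "http://data.gramene.org/search/genes"
FACET_FIELDS = [
    "bin_10Mb", "GO_ancestors", "interpro_ancestors",
    "PO_ancestors", "NCBITaxon_ancestors", "biotype",
]


def build_facet_url(query="*", fields=FACET_FIELDS):
    """Build the facet-only query (rows=0) quoted in this thread."""
    params = [("q", query), ("rows", 0), ("facet", "true"),
              ("facet.threads", -1)]
    params += [("facet.field", field) for field in fields]
    # mincount=1 drops empty facets; limit=-1 asks for every value.
    params += [("facet.mincount", 1), ("facet.limit", -1)]
    return BASE_URL + "?" + urlencode(params, safe="*")


def parse_facet_fields(response):
    """Turn flat [value, count, value, count, ...] lists into dicts.

    Assumes the default Solr JSON layout (see the note above).
    """
    raw = response.get("facet_counts", {}).get("facet_fields", {})
    return {field: dict(zip(flat[0::2], flat[1::2]))
            for field, flat in raw.items()}


# Made-up response fragment, for illustration only.
sample = {"facet_counts": {"facet_fields": {
    "biotype": ["protein_coding", 10, "pseudogene", 2]}}}
stores = parse_facet_fields(sample)
```

Keeping each field as a value-to-count dictionary makes it cheap to check whether a domain occurs in the current result set before offering it as a filter.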

mycrobe commented 9 years ago

Assess the available search filters in BioMart, e.g. Homologues.

mycrobe commented 9 years ago

Consider allowing paste of lists of domains into this interface.
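A possible starting point for handling pasted text, assuming the domains are given as InterPro accessions (`IPR` followed by six digits) separated by whitespace, commas, or semicolons. The function name and the choice to report rejected tokens back to the user are hypothetical:

```python
import re

INTERPRO_ID = re.compile(r"^IPR\d{6}$")


def parse_pasted_domains(text):
    """Split pasted text into unique InterPro accessions.

    Returns (valid, rejected): accessions in first-seen order, and the
    tokens that did not look like an accession so the UI can show them.
    """
    valid, rejected, seen = [], [], set()
    for token in re.split(r"[\s,;]+", text.strip()):
        if not token:
            continue
        candidate = token.upper()  # accept lower-case pastes
        if INTERPRO_ID.match(candidate):
            if candidate not in seen:
                seen.add(candidate)
                valid.append(candidate)
        else:
            rejected.append(token)
    return valid, rejected


valid, rejected = parse_pasted_domains(
    "IPR000001, ipr000002\nIPR000001 foo")
```

Here the duplicate accession is dropped, the lower-case one is normalized, and `foo` ends up in `rejected`.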

ajo2995 commented 9 years ago

Roll-your-own (RYO) filter for power users?