m-lab / mlab-vis-api

API component of New and Exciting visualization collaboration with Measurement Lab
MIT License

implement facet search #7

Closed vlandham closed 8 years ago

vlandham commented 8 years ago

The Power tool component of the client requires more sophisticated search capabilities.

Current strategy for dealing with these searches:

There are 3 facets to orient search around:

Each facet allows for searching of all 3 entities but with different constraints.

With these facets, here are the different entity/filtering combinations we have:

A tradeoff of this approach is that it limits us to filtering by a single entity. This works well with our facet-type UI - but means that client isp, for example, could only be filtered by location, not by location and transit isp.

Possible Implementation

We have an existing solution to filtering by one entity in place:

client_location_client_asn is a table with a compound key of 2 values:

This provides a solution for filtering a client isp by a location. If we expand this solution to include the additional filtering we need, it would involve making these tables:

where ✔️ indicates already existing tables.

Our current implementation of location_search only provides starts-with name matching, due to bigtable query capabilities. This is also a limitation of client_asn_search, and would probably be a limitation for transit_asn_search as well.

For filtering, the keys are all built from asn keys and location keys. So the easiest thing to do would be to pass all results matching a filter (for example, all transit isps with client location of 'new york') to the api. If the resulting dataset was large, further filtering could occur at the api level (example: only pull out values with name containing 'lev').
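To illustrate why only starts-with matching is cheap on a store with sorted row keys, here is a minimal sketch of a prefix scan. It is not the project's code: `prefix_range` and `scan` are hypothetical helpers, and a sorted in-memory list stands in for the bigtable table.

```python
import bisect
from typing import List, Optional, Tuple


def prefix_range(prefix: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return (start, end) so that start <= key < end matches keys with this prefix.

    end is None when there is no upper bound (empty or all-0xff prefix).
    """
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    stripped = prefix.rstrip(b"\xff")
    if not stripped:
        return prefix, None
    # Increment the last remaining byte to get the exclusive upper bound.
    end = stripped[:-1] + bytes([stripped[-1] + 1])
    return prefix, end


def scan(sorted_keys: List[bytes], prefix: bytes) -> List[bytes]:
    """Simulate a row-key range scan over a sorted list of keys (assumed stand-in)."""
    start, end = prefix_range(prefix)
    lo = bisect.bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if end is None else bisect.bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]


# Illustrative keys only; real row keys are built differently.
keys = sorted([b"london", b"newark", b"newyork", b"york"])
new_matches = scan(keys, b"new")
york_matches = scan(keys, b"york")
```

A scan for `new` is one contiguous key range, while a scan for `york` cannot find `newyork` because the match is not at the start of the key. A "contains" search would therefore need a full scan or a separate index.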
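The api-level filtering step could look roughly like the sketch below. The row shape (dicts with `id` and `name` fields) and the function name `filter_by_name` are assumptions made for illustration, not the actual api.

```python
from typing import Dict, Iterable, List


def filter_by_name(rows: Iterable[Dict[str, str]], term: str) -> List[Dict[str, str]]:
    """Keep rows whose name contains term, ignoring case."""
    needle = term.casefold()
    return [row for row in rows if needle in row.get("name", "").casefold()]


# Hypothetical result of "all transit isps with client location of 'new york'".
transit_isps = [
    {"id": "asn_a", "name": "Level 3"},
    {"id": "asn_b", "name": "Cogent"},
    {"id": "asn_c", "name": "Sublevel Networks"},
]

matches = filter_by_name(transit_isps, "lev")
```

Because this runs in the api after the bigtable read, it can match anywhere in the name rather than only at the start.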

The UI has a need for allowing multiple values in the initial filter. We will use these as union values for the filterable dependent entities.

For example: when faceting by location, new york and london are selected. Now the user is selecting a client_isp. The API in this scenario must return client isps that appear in new york or london.

This could be implemented at the API level by acquiring both client isp lists from bigtable, and then performing the union there.
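A minimal sketch of that union step, assuming each bigtable lookup has already returned a list of dicts with an `id` field (the row shape and the name `union_by_key` are hypothetical):

```python
from typing import Dict, Iterable, List


def union_by_key(result_lists: Iterable[List[Dict[str, str]]], key: str = "id") -> List[Dict[str, str]]:
    """Merge several result lists, keeping the first row seen for each key."""
    seen: Dict[str, Dict[str, str]] = {}
    for rows in result_lists:
        for row in rows:
            # setdefault keeps the earlier row when a key repeats.
            seen.setdefault(row[key], row)
    # Dicts preserve insertion order, so results stay in first-seen order.
    return list(seen.values())


# Hypothetical per-location results for client isps.
new_york = [{"id": "asn_a", "name": "ISP A"}, {"id": "asn_b", "name": "ISP B"}]
london = [{"id": "asn_b", "name": "ISP B"}, {"id": "asn_c", "name": "ISP C"}]

client_isps = union_by_key([new_york, london])
```

An isp that appears in both locations is returned once, which is what the facet UI needs when offering the dependent entity list.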

pbeshai commented 8 years ago

Thanks for writing this up Jim, looks good to me :+1:

vlandham commented 8 years ago

done.