presidential-innovation-fellows / clinical-trials-search

NOTE: THIS REPOSITORY IS NO LONGER BEING MAINTAINED.

Filtering on Nested Objects should be a Nested Filter #11

Closed bryanpizzillo closed 8 years ago

bryanpizzillo commented 8 years ago

The API was designed so that multiple filters against a single field are combined as a union (OR), and filters against different fields are combined as an intersection (AND). This works correctly when the filters apply to the fields of a trial, for example, finding all lung cancer trials in New York City.

However, this causes problems with certain types of nested objects, specifically the trial sites. For example, when looking for trial sites in Springfield, Illinois, the API will match any trial that has a study site with city = Springfield and also has a site with state = Illinois, but nothing constrains the match to a single study site that has BOTH city = Springfield AND state = Illinois.

A real-world example: trial NCT02496585 has two trial sites, one in New York, NY and one in Basking Ridge, NJ. I can submit a query that looks for any trial with State = NJ and City = New+york, and this trial will match.

See the link below for an example: https://clinicaltrialsapi.cancer.gov/clinical-trials?nct_id=NCT02496585&sites.org.state_or_province=nj&sites.org.city=New+york
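The mismatch can be reproduced without Elasticsearch. The following is a minimal Python sketch, assuming a trial is a plain dict with a `sites` list whose entries carry `city` and `state` keys. Those key names and both helper functions are hypothetical simplifications for illustration, not the API's actual schema or code.

```python
# Hypothetical, simplified record for the trial described above.
# The real API exposes these values under sites.org.* fields.
trial = {
    "nct_id": "NCT02496585",
    "sites": [
        {"city": "New York", "state": "NY"},
        {"city": "Basking Ridge", "state": "NJ"},
    ],
}


def matches_flattened(trial, filters):
    """Each field filter may be satisfied by a different site (the reported behavior)."""
    return all(
        any(site[field].lower() == value.lower() for site in trial["sites"])
        for field, value in filters.items()
    )


def matches_nested(trial, filters):
    """Every field filter must be satisfied by the same site (the desired behavior)."""
    return any(
        all(site[field].lower() == value.lower() for field, value in filters.items())
        for site in trial["sites"]
    )


filters = {"state": "NJ", "city": "New York"}
flattened_result = matches_flattened(trial, filters)
nested_result = matches_nested(trial, filters)
print(flattened_result, nested_result)  # prints: True False
```

The only difference between the two helpers is the order of `any` and `all`: the flattened version asks each filter whether some site satisfies it, while the nested version asks each site whether it satisfies every filter.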

When performing filters against multiple fields within a nested object, the API should use the nested filter functionality of Elasticsearch.
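As a rough illustration of the kind of request body this implies, the sketch below builds an Elasticsearch `nested` query as a plain Python dict. It assumes that `sites` is mapped with type `nested` in the index (this thread does not confirm that) and that the field names match the query parameters in the URL above. The helper name `nested_site_query` is hypothetical, and `match` clauses are used here only as one reasonable choice.

```python
import json


def nested_site_query(filters, path="sites"):
    """Build a request body requiring ONE object under `path` to match every filter.

    Assumes `path` is mapped as a nested type; otherwise Elasticsearch
    rejects the nested query.
    """
    clauses = [
        {"match": {f"{path}.{field}": value}}
        for field, value in filters.items()
    ]
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"must": clauses}},
            }
        }
    }


# Field names taken from the example URL; their mapping is an assumption.
body = nested_site_query(
    {"org.city": "Springfield", "org.state_or_province": "Illinois"}
)
print(json.dumps(body, indent=2))
```

Because both clauses sit inside a single `nested` block, they are evaluated against each site object separately, so a trial matches only if one site satisfies both conditions.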

blairlearn commented 8 years ago

A variation of this problem exists with full text search.

Doing a full-text search for "chicken" yields one trial: https://clinicaltrialsapi.cancer.gov/v1/clinical-trials?_fulltext=chicken

Doing a full-text search for "chicken" within 100 miles of Bethesda yields 1101 trials: https://clinicaltrialsapi.cancer.gov/v1/clinical-trials?_fulltext=chicken&sites.org_coordinates_lat=39.0038878&sites.org_coordinates_lon=-77.1053673&sites.org_coordinates_dist=100mi