itstommymorgan / asari

a Ruby wrapper for AWS CloudSearch.

Native latlon queries with the geo: {...} directive #42

Open david-l-young opened 10 years ago

david-l-young commented 10 years ago

Friends,

This is not an official pull request. I just wanted to let you know what I have done to support our requirements at Pivotdesk. You can browse the code changes on the 'native-latlon' branch (based on your 1.0 branch). If it looks like something you want to fold back into your project, I will update the documentation and test suite.

I have written extensive tests for our own project that makes use of this forked version, so I am confident that it works. Here is an overview of the changes...

Largely we wanted native 2013-01-01 support for geographic queries. I have implemented this as a new search keyword (geo:) that appears at the same level as that for boolean queries (filter:).

This new keyword supports sorting results by distance from a geographic location, and filtering out results that fall outside a given radius of that location.

The long form looks like this...

-- Find documents that mention "donuts" within 5 miles of the geographic center of Boulder, CO (assuming documents have a latlon field named 'location'):

.search( "donuts", filter: ..., geo: {field: :location, latitude: 40.018248, longitude: -105.278163, radius: 5, units: :miles})

The short form allows you to fold the lat/lon in as a value for a geographic field like this...

.search( "donuts", geo: {location: [40.018248, -105.278163], radius: 5, units: :miles})
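
To illustrate how the short form relates to the long form, here is a minimal sketch of the expansion. The helper name expand_geo and the GEO_OPTION_KEYS list are hypothetical and are not taken from the branch; the sketch simply assumes that any key which is not a known option names the latlon field.

```ruby
# Hypothetical helper, not part of asari: expands the short geo form
# ({location: [lat, lon], ...}) into the long form
# ({field: :location, latitude: lat, longitude: lon, ...}).
GEO_OPTION_KEYS = [:field, :latitude, :longitude, :radius, :units, :sort].freeze

def expand_geo(geo)
  return geo.dup if geo.key?(:field) # already in long form

  # Assumption: any key that is not a known option is the field name.
  field, point = geo.find { |key, _| !GEO_OPTION_KEYS.include?(key) }
  raise ArgumentError, "no latlon field given" if field.nil?
  unless point.is_a?(Array) && point.size == 2
    raise ArgumentError, "expected [latitude, longitude] for #{field}"
  end

  # Carry over the remaining options (radius, units, sort) unchanged.
  options = geo.reject { |key, _| key == field }
  { field: field, latitude: point[0], longitude: point[1] }.merge(options)
end

p expand_geo(location: [40.018248, -105.278163], radius: 5, units: :miles)
```

With this approach the rest of the query builder only ever has to deal with the long form.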

Other available units for the radius are :kilometers (the default), :meters, or :degrees.
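
As a rough illustration of what the units option implies, the sketch below converts a radius to kilometers and derives a bounding box that could be used as a CloudSearch latlon range filter. This is an assumption about one plausible approach, not a description of the code on the branch. The helper names and the 111.32 km-per-degree figure are approximations introduced here.

```ruby
# Approximate length of one degree of latitude in kilometers (an assumption
# that is good enough for small radii, not an exact geodesic value).
KM_PER_DEGREE = 111.32

KM_PER_UNIT = {
  kilometers: 1.0,
  meters: 0.001,
  miles: 1.609344,
  degrees: KM_PER_DEGREE
}.freeze

# Hypothetical helper: converts a radius in the given units to kilometers.
def radius_in_km(radius, units = :kilometers)
  factor = KM_PER_UNIT.fetch(units) do
    raise ArgumentError, "unknown units: #{units.inspect}"
  end
  radius * factor
end

# Hypothetical helper: a box that encloses the circle around the center.
# The longitude span widens with latitude, so this breaks down near the poles.
def bounding_box(latitude, longitude, radius, units = :kilometers)
  km = radius_in_km(radius, units)
  dlat = km / KM_PER_DEGREE
  dlon = km / (KM_PER_DEGREE * Math.cos(latitude * Math::PI / 180))
  { north: latitude + dlat, south: latitude - dlat,
    west: longitude - dlon, east: longitude + dlon }
end

# 2013-01-01 latlon range term: upper-left corner, then lower-right corner.
def latlon_range(field, box)
  format("%s:['%.6f,%.6f','%.6f,%.6f']",
         field, box[:north], box[:west], box[:south], box[:east])
end

box = bounding_box(40.018248, -105.278163, 5, :miles)
puts latlon_range(:location, box)
```

A bounding box is slightly larger than the circle it encloses, so results near the corners can be farther away than the requested radius.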

If you just want to sort results by their distance from the location, and not filter anything, it would look like this...

.search( "donuts", geo: {field: :location, latitude: 40.018248, longitude: -105.278163, sort: true})

If you don't specify a radius, sorting by distance is the default, so a short form of the above could look like this...

.search( "donuts", geo: {location: [40.018248, -105.278163]})

You can also specify a radius and ask for sort: true, which will do both (filter by the radius and sort by distance from the center).
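
The rules for when filtering and sorting apply can be sketched as follows. The helper names are hypothetical and not taken from the branch. The sort parameters assume the CloudSearch 2013-01-01 haversin expression function and the .latitude/.longitude accessors on latlon fields; whether the branch builds the request this way is an assumption.

```ruby
# Hypothetical helper: decides what a geo: hash asks for.
#   no radius           -> sort by distance (the default)
#   radius only         -> filter only
#   radius + sort: true -> filter and sort
def geo_plan(geo)
  has_radius = !geo[:radius].nil?
  # An explicit sort: value wins; otherwise sort only when no radius is given.
  wants_sort = geo.key?(:sort) ? !!geo[:sort] : !has_radius
  { filter: has_radius, sort: wants_sort }
end

# Hypothetical helper: request parameters that sort by distance from a point,
# using a named expression called "distance".
def distance_sort_params(field, latitude, longitude)
  {
    "expr.distance" =>
      "haversin(#{latitude},#{longitude},#{field}.latitude,#{field}.longitude)",
    "sort" => "distance asc"
  }
end

p geo_plan(location: [40.018248, -105.278163])
p geo_plan(location: [40.018248, -105.278163], radius: 5, sort: true)
p distance_sort_params(:location, 40.018248, -105.278163)
```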

As I went along, I also fixed some other issues in the filter: syntax...

[1] When filter: was present, the text matching query was being ignored.

[2] Missing terms are now left out instead of defaulting to '', so that the DEFAULT_VALUE on the AWS declarations will kick in.

[3] Implemented 2013-01-01 syntax for more filter terms, including open-ended ranges and date ranges. You can see this code in the normalize_* methods in asari.rb.

[4] Providing an array of values in a filter (filter: {name: ['chocolate cake', 'strawberry frosted']}) generates a query for an OR of the values.
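
To make items [2] through [4] concrete, here is a minimal sketch of how filter values might map onto the 2013-01-01 structured query syntax. It is not the normalize_* code from the branch; the helper names are hypothetical, and using Ruby ranges to express open-ended ranges is an assumption made for this example (beginless ranges need Ruby 2.7 or later).

```ruby
require "time"

# Quotes a single value for a structured query term.
def quote(value)
  case value
  when Numeric then value.to_s
  when Time then "'#{value.getutc.iso8601}'"
  else
    # Escape backslashes first, then single quotes.
    "'#{value.to_s.gsub("\\") { "\\\\" }.gsub("'") { "\\'" }}'"
  end
end

# Builds a range term; a nil bound becomes an open end ({ or }).
def range_term(field, range)
  lower = range.begin.nil? ? "{" : "[#{quote(range.begin)}"
  upper =
    if range.end.nil? then "}"
    elsif range.exclude_end? then "#{quote(range.end)}}"
    else "#{quote(range.end)}]"
    end
  "#{field}:#{lower},#{upper}"
end

# An array of values becomes an OR of the individual terms (item [4]).
def filter_term(field, value)
  case value
  when Array then "(or #{value.map { |v| filter_term(field, v) }.join(' ')})"
  when Range then range_term(field, value)
  else "#{field}:#{quote(value)}"
  end
end

# Missing (nil or empty) terms are left out entirely (item [2]).
def filter_query(filter)
  present = filter.reject { |_, value| value.nil? || value == "" }
  "(and #{present.map { |field, value| filter_term(field, value) }.join(' ')})"
end

puts filter_query(name: ['chocolate cake', 'strawberry frosted'],
                  year: (2000..), flavor: nil)
```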

Anyway, let me know if you are interested in an official Pull Request for this work and I will find time to better package it up for you.

Best Regards,

David L Young CTO at Pivotdesk david.young@pivotdesk.com 303.916.6942 @PivotdeskCTO

zrisher commented 10 years ago

Personally, I would love to see this implemented. Even though I don't currently use geo search, the other fixes ([1] through [4]) would be great to have.