Closed kidlab closed 3 years ago
After looking at the latest source code. I saw that the distance field was removed and not included in the select query. I don't know why, maybe they did it because of some performance problem.
Here's my workaround: config/initializers/geokit_rails_patch.rb
module Geokit::ActsAsMappable
module ClassMethods
# Overwrite Geokit `within`
def within(distance, options = {})
options[:within] = distance
# Have to copy the options because `build_distance_sql` will delete them.
conditions = distance_conditions(options.dup)
sql = build_distance_sql(options)
self.select(
"#{sql} AS #{self.distance_column_name}, #{table_name}.*"
).where(conditions)
end
# Overwrite Geokit `by_distance`
def by_distance(options = {})
sql = build_distance_sql(options)
self.select("#{sql} AS #{self.distance_column_name}, #{table_name}.*"
).order("#{self.distance_column_name} ASC")
end
private
# Copied from geokit-rails/lib/geokit-rails/acts_as_mappable.rb
# Extract distance_sql building to method `build_distance_sql`.
def build_distance_sql(options)
origin = extract_origin_from_options(options)
units = extract_units_from_options(options)
formula = extract_formula_from_options(options)
distance_sql(origin, units, formula)
end
end
end
Yes I faced the same problem .. thanks for the patch. Hope it gets fixed in the main branch.
Glad to know the patch is useful for you.
Well, it seems that they're refactoring and have a lot of internal changes in the gem. The README is also out-updated. It has some instruction about geo_scope
method, but actually that method is removed! |o|
@kidlab @rohandey Please see https://github.com/geokit/geokit-rails/issues/39 It explains why the change is made, that I'm open to PR's to restore similar functionality, etc.
My apologies for the README going out of date. To be honest I'm tempted to remove many sections from the readme and just point people to the tests as they are always up to date and are usually great examples of how it can be used. I welcome any PR's to correct things.
@kidlab If you make that patch as a PR to geokit and it passes all the tests I'll merge it.
Oh, good to know the class DistanceCollection
. Thank you @mnoack. However, according to the source code of DistanceCollection
, it sorts Ruby objects (by using sort_by
) so it's not very efficient. Is it correct?
And yes, let me inspect more details the source code and create a PR for this. Hope it work :)
As far as I see, the performance impact isn't noticeable, i.e. In a real world scenario aren't we talking about showing the nearest 50 cafe's to someone, etc. So if we make it 100x more efficient it might go from 1/10th of a millisecond to sort to 1/1000th of a millisecond to sort.
Sometimes you sacrifice performance for quality and flexibility, having code that is easy to read/understand, etc. We're using ruby/rails for that same reason (dropping some minor performance for quality/simplicity)
I respect your point of view @mnoack. However, I'm working on a real-time updated location-based system that needs to find the nearest location within an origin. You know, with such an system, we need to figure out the fastest way of distance-calculating and sorting. With that point of view, the performance is the key point. For instance, if we can just use one query to list out 50 nearest cafes to someone, it is really cool. The more DB round-trips we can save, the more requests we can serve!
I'll happily accept a PR to improve any performance, the question I was asking though is, does 1/10th of 1 millisecond really matter?
If it does, then there's probably even more important things that can be done, e.g. upgrade to a newer rails/ruby version that might reduce the speed of your requests by over a 1 whole millisecond with less effort.
@mnoack I don't see the point of your 50 cafe example. Isn't the usual usecase to select 50 nearest cafes around a user out of a DB of ten-thousends of items? And in those cases it is not a question of millisecond-performance differences when you can pass over the calculation to the DB which can also make use of indices etc
I did some quick benchmarks looks like for 1000 iterations sorting the objects post-query via Ruby only increases execution time by (an average) 1/10th of a second. (Disclaimer, I have a very small dataset so obviously the more objects you have the bigger the gap in time will).
Benchmark.measure { 1000.times { Venue.within(5, origin: origin).sort_by { |v| v.distance_to(origin) } } }
Benchmark @real=1.487982, @cstime=0.0, @cutime=0.0, @stime=0.5699999999999994, @utime=0.7400000000000002, @total=1.3099999999999996
Benchmark.measure { 1000.times { Venue.within(5, origin: origin) } }
Benchmark @real=1.428282, @cstime=0.0, @cutime=0.0, @stime=0.5799999999999992, @utime=0.6799999999999997, @total=1.259999999999999
EDIT:
I just tried with 1000 rows in the table and 100 iterations, using sort_by was about 1/2 a second slower. (This time execution took ~25 seconds, but I imagine a lot of that was due to output to the console). That equates to about 5 milliseconds of added execution time. All in all, not a huge increase. Please correct me if I did my tests wrong, however.
@kidlab thanks! works great.
@kidlab I had some issues with your solution. When calling .count, bad SQL is generated as it tries to wrap the select clause in a COUNT(). The pluck method wipes out generated distance column. This is definitely not a criticism of your solution, just wanted to document my issues for future users. As of version 2.1.0, the solution for sorting by distance for me was to use geokit's by_distance method. It requires specifying an origin which is a bit redundant after having already provided one in the call to within, but the result was exactly what I was looking for.
Wondering why this isn't in master yet? Went to create a new Rails project and took me awhile to realise that I didn't have @kidlab's "within" override in an initializer (my other rails projects already did).
For all people (like me) hitting this thread, searching for a way to order results by distance with this Gem. This thread (and all linked issues) might give the impression that its not supported at all to order results by distance, for the reason of lacking a "distance" field.
Which is not true:
although (from OP)...
Location.within(5, origin: [37.792,-122.393]).order("distance asc")
...will not work because of the absence of the "distance" field, Like @bkmediagroup-josh metioned. Using the provided by_distance
scope works fine. So the equivalent would be:
Location.within(5, origin: [37.792,-122.393]).by_distance(origin: [37.792,-122.393])
Not sure about the internals and optimizations. But I just wanted to point out that this feature is provided as a custom scope, which is actually documented fine, but might be easy to miss.
Indeed, @sandergarretsen and @bkmediagroup-josh are correct, using by_distance
should be the solution.
I'm using geokit-rails 2.0.1. When I run:
An error occurred:
Why does the field
distance
go away?