alexreisner / geocoder

Complete Ruby geocoding solution.
http://www.rubygeocoder.com
MIT License
6.35k stars 1.19k forks source link

Distance column doesn't exist as soon as I want to do some calculations #1205

Open Kulgar opened 7 years ago

Kulgar commented 7 years ago

Hello,

Expected behavior

I am not sure if it is a bug... But I definitely need your help. I am trying to order results with a calculation on distances, like this:

User.near([lat, lng], 60, :order => "distance - 100 * rating")

So that a user with a 5 star ratings at 500 meters will appear before a user with 0 star ratings at 300 meters.

Actual behavior

Unfortunately as soon as I add any calculations on the distance field, I get this error:

column "distance" doesn't exist

Even if I only do something like "distance - 5"

The generated SQL query is:

SELECT sitters.*, 6371.0 * 2 * ASIN(SQRT(POWER(SIN((48.6060508 - sitters.latitude) * PI() / 180 / 2), 2) + COS(48.6060508 * PI() / 180) * COS(sitters.latitude * PI() / 180) * POWER(SIN((2.3040542 - sitters.longitude) * PI() / 180 / 2), 2))) AS distance, MOD(CAST((ATAN2( ((sitters.longitude - 2.3040542) / 57.2957795), ((sitters.latitude - 48.6060508) / 57.2957795)) * 57.2957795) + 360 AS decimal), 360) AS bearing FROM "sitters" WHERE "sitters"."available" = $1 AND ("sitters"."birthday" IS NOT NULL) AND ("sitters"."address_id" IS NOT NULL) AND ("sitters"."phone" IS NOT NULL) AND ("sitters"."mangopay_id" IS NOT NULL) AND (birthday < '1999-07-21 18:36:56.799307') AND (sitters.latitude BETWEEN -539544.3575004383 AND 539641.5696020382 AND sitters.longitude BETWEEN -816038.7180184539 AND 816043.3261268538 AND (6371.0 * 2 * ASIN(SQRT(POWER(SIN((48.6060508 - sitters.latitude) * PI() / 180 / 2), 2) + COS(48.6060508 * PI() / 180) * COS(sitters.latitude * PI() / 180) * POWER(SIN((2.3040542 - sitters.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND '60000000')  ORDER BY distance - 5 * rating

I tested without distance:

User.near([lat, lng], 60, :order => "100 * rating")

This works fine.

Environment info

Geocoder version: 1.4.4 Rails version: 4.2.5

Maybe you will find another (better?) solution for what I want to do. Any advice/thoughts/help will be appreciated.

But I still find this behaviour a little bit strange, as we should be able to do some operations on the distance field, shouldn't we?

Many thanks! Kulgar

TrangPham commented 7 years ago

What DB are you using?

Each DB handles SQL syntax slightly differently unfortunately.

What's happening here is we are trying to use a column alias distance in the order by clause.

This should work if you are using MYSQL. See: https://stackoverflow.com/questions/13031013/how-do-i-use-alias-in-where-clause

Kulgar commented 7 years ago

Unfortunately I am using postgresql...

victorhazbun commented 6 years ago

I'm having the exact same problem with Postgresql

alexreisner commented 6 years ago

According to this comment geokit-rails solves this problem (by avoiding aliased columns?). We should look into how it works since this seems to be a common issue for people.

ryanpool commented 6 years ago

I just had a vaguely similar problem recently because I ignored the warning not to use .includes with .near. When I had .includes(:other_model_relationship) Rails generated the query using .eager_load() and the alias for distance wasn't there. I changed from .includes() to .joins() and the alias for distance started showing up again.

Sharing in case that helps anyone else.

allantokuda-zipnosis commented 4 years ago

Seemingly related, I am seeing that I cannot do use ActiveRecord .count (and hence sometimes also .size) on the result. Seems like it's caused by there being that AS alias inside of ActiveRecord's added COUNT which Postgres doesn't like:

Location.near('Minneapolis', 10).count

   (1.3ms)  SELECT COUNT(locations.*, 3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((44.977753 - locations.latitude) * PI() / 180 / 2), 2) + COS(44.977753 * PI() / 180) * COS(locations.latitude * PI() / 180) * POWER(SIN((-93.2650108 - locations.longitude) * PI() / 180 / 2), 2))) AS distance, MOD(CAST((ATAN2( ((locations.longitude - -93.2650108) / 57.2957795), ((locations.latitude - 44.977753) / 57.2957795)) * 57.2957795) + 360 AS decimal), 360) AS bearing) FROM "locations" WHERE (locations.latitude BETWEEN 44.83302121688915 AND 45.12248478311085 AND locations.longitude BETWEEN -93.46961302236085 AND -93.06040857763915 AND (3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((44.977753 - locations.latitude) * PI() / 180 / 2), 2) + COS(44.977753 * PI() / 180) * COS(locations.latitude * PI() / 180) * POWER(SIN((-93.2650108 - locations.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 10)
ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR:  syntax error at or near "AS"
LINE 1: ...08 - locations.longitude) * PI() / 180 / 2), 2))) AS distanc...
                                                             ^

Would be nice if ActiveRecord wrapped the entire query in a COUNT, but instead it only wraps the first portion between SELECT ... FROM.

Calling length works but more expensive to load all data into memory first only to count the results.

Location.near('Minneapolis', 10).length

  Location Load (1.0ms)  SELECT locations.*, 3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((44.977753 - locations.latitude) * PI() / 180 / 2), 2) + COS(44.977753 * PI() / 180) * COS(locations.latitude * PI() / 180) * POWER(SIN((-93.2650108 - locations.longitude) * PI() / 180 / 2), 2))) AS distance, MOD(CAST((ATAN2( ((locations.longitude - -93.2650108) / 57.2957795), ((locations.latitude - 44.977753) / 57.2957795)) * 57.2957795) + 360 AS decimal), 360) AS bearing FROM "locations" WHERE (locations.latitude BETWEEN 44.83302121688915 AND 45.12248478311085 AND locations.longitude BETWEEN -93.46961302236085 AND -93.06040857763915 AND (3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((44.977753 - locations.latitude) * PI() / 180 / 2), 2) + COS(44.977753 * PI() / 180) * COS(locations.latitude * PI() / 180) * POWER(SIN((-93.2650108 - locations.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 10) ORDER BY distance ASC
2

It seems that in this scenario the distance and bearing columns are not being used, so they probably should not be selected at all. That would allow there to be no aliases and allow the query to succeed. Wondering if such logic is possible to apply through this gem?

alexreisner commented 4 years ago

@allantokuda-zipnosis does using count(:all) work?

allantokuda-zipnosis commented 4 years ago

Thank you @alexreisner, yes this does work! Location.all.near('Minneapolis', 100).count(:all)

In case it helps anyone, I got stuck again at a related problem where our app actually further refines the query using a pagination limit. We're using this combined with "offset" to do pagination, and then calculating a count of results on current page. Expected behavior is that the count will equal the per-page count unless we are on the last page, which typically has a smaller count. Location.all.near('Minneapolis', 100).limit(1000).count(:all)

The query ran into an error after upgrading Rails from 5.0 to 5.1.

# Rails 5.0
[1] pry(main)> Location.all.near('Minneapolis', 100).limit(1000).count(:all)
   (2.2ms)  SELECT COUNT(count_column) FROM (SELECT  1 AS count_column FROM "locations" WHERE (locations.latitude BETWEEN 43.53043516889152 AND 46.42507083110848 AND locations.longitude BETWEEN -95.31103302360845 AND -91.21898857639155 AND (3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((44.977753 - locations.latitude) * PI() / 180 / 2), 2) + COS(44.977753 * PI() / 180) * COS(locations.latitude * PI() / 180) * POWER(SIN((-93.2650108 - locations.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 100) LIMIT $1) subquery_for_count  [["LIMIT", 1000]]
2

# Rails 5.1
[1] pry(main)> Location.all.near('Minneapolis', 100).limit(1000).count(:all)
   (7.8ms)  SELECT COUNT(*) FROM (SELECT  1 AS one FROM "locations" WHERE (locations.latitude BETWEEN 43.53043516889152 AND 46.42507083110848 AND locations.longitude BETWEEN -95.31103302360845 AND -91.21898857639155 AND (3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((44.977753 - locations.latitude) * PI() / 180 / 2), 2) + COS(44.977753 * PI() / 180) * COS(locations.latitude * PI() / 180) * POWER(SIN((-93.2650108 - locations.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 100) ORDER BY distance ASC LIMIT $1) subquery_for_count  [["LIMIT", 1000]]
ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR:  column "distance" does not exist
LINE 1: ...) / 180 / 2), 2)))) BETWEEN 0.0 AND 100) ORDER BY distance A...
                                                             ^

This is because count querying behavior changed to include an order clause even for count operations: https://github.com/rails/rails/commit/62bb8b9dfdf4101b4cefedb36a97531e750d623c

Fixes:

vlad-pisanov commented 4 years ago

Same problem (PG):

If I do Location.near('Whatever', 1).ids I get ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR: column "distance" does not exist

so I have to resort to Location.near('Whatever', 1, select_distance: false, select_bearing: false, order: false).ids

jdurand commented 4 years ago

I'm having a similar issue with the following query (simplified). Removing wat from the includes produces a query that doesn't raise the column "distance" does not exist error :

Entity.joins(foo: { bars: { baz: :wat }}).includes(foo: { bars: { baz: :wat }})

SELECT DISTINCT "distance" AS alias_0, "entities"."id" FROM "entities" INNER JOIN "foos" ON "foos"."deleted_at" IS NULL AND "foos"."aasm_state" != 'archived' AND "foos"."id" = "entities"."foo_id" INNER JOIN "bars" ON "bars"."foo_id" = "foos"."id" INNER JOIN "bazs" ON "bazs"."bar_id" = "bars"."id" INNER JOIN "wats" ON "wats"."id" = "bazs"."wat_id" LEFT OUTER JOIN "menu_menus" ON "menu_menus"."deleted_at" IS NULL AND "menu_menus"."entity_id" = "entities"."id" WHERE "entities"."deleted_at" IS NULL AND (entities.latitude BETWEEN -25.549873255423982 AND 119.18190985542398 AND entities.longitude BETWEEN -176.96328328997612 AND 34.52675008997609 AND (3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((46.8160183 - entities.latitude) * PI() / 180 / 2), 2) + COS(46.8160183 * PI() / 180) * COS(entities.latitude * PI() / 180) * POWER(SIN((-71.2182666 - entities.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 5000) ORDER BY "distance" ASC LIMIT 20 OFFSET 0)

Raises the exception

Entity.joins(foo: { bars: { baz: :wat }}).includes(foo: { bars: :baz })

SELECT entities.*, 3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((46.8160183 - entities.latitude) * PI() / 180 / 2), 2) + COS(46.8160183 * PI() / 180) * COS(entities.latitude * PI() / 180) * POWER(SIN((-71.2182666 - entities.longitude) * PI() / 180 / 2), 2))) AS distance, MOD(CAST((ATAN2( ((entities.longitude - -71.2182666) / 57.2957795), ((entities.latitude - 46.8160183) / 57.2957795)) * 57.2957795) + 360 AS decimal), 360) AS bearing FROM "entities" INNER JOIN "foos" ON "foos"."deleted_at" IS NULL AND "foos"."aasm_state" != 'archived' AND "foos"."id" = "entities"."foo_id" INNER JOIN "bars" ON "bars"."foo_id" = "foos"."id" INNER JOIN "bazs" ON "bazs"."bar_id" = "bars"."id" INNER JOIN "wats" ON "wats"."id" = "bazs"."wat_id" WHERE "entities"."deleted_at" IS NULL AND (entities.latitude BETWEEN -25.549873255423982 AND 119.18190985542398 AND entities.longitude BETWEEN -176.96328328997612 AND 34.52675008997609 AND (3958.755864232 * 2 * ASIN(SQRT(POWER(SIN((46.8160183 - entities.latitude) * PI() / 180 / 2), 2) + COS(46.8160183 * PI() / 180) * COS(entities.latitude * PI() / 180) * POWER(SIN((-71.2182666 - entities.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 5000) ORDER BY "distance" ASC LIMIT 20 OFFSET 0

Works!

innoc commented 3 years ago

Has this been fixed? Having a similar issue.

mrjonesbot commented 2 years ago

Same problem (PG):

If I do Location.near('Whatever', 1).ids I get ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR: column "distance" does not exist

so I have to resort to Location.near('Whatever', 1, select_distance: false, select_bearing: false, order: false).ids

Can confirm that this adjustment is still needed for ordered/paginated queries (thank you @vlad-pisanov ).

andyrue commented 1 year ago

It's disappointing this has been open for so long. I was migrating from geokit to geocoder and ran into this problem on Postrgresql.

If it's any help, it looks like geokit doesn't include distance by default and instead uses a DistanceCollection class to add that when needed. https://www.rubydoc.info/gems/geokit-rails/2.1.0/DistanceCollection

joshuacronemeyer commented 10 months ago

+1 for improving this. Without being able to eager_load when geocoder is in play, avoiding n+1 becomes a problem. Current solution is 2 queries:

ItemController#index pseudocode

  @items = Item.joins(:categories).where(categories: {id: category.id}).near([lat,long], radius).distinct.page
  @eager_loaded_items = Item.where(id: items.map(&:id)).includes(:categories)

Then in my index.html.erb I'll use @items for kaminari pagination stuff and @eager_loaded_items where i actually render the items so I don't get n+1 when the view references the categories of the item... (putting this here in case it helps people or somebody has a better way they can tell me. For our app without eager loading it adds multiple seconds to rendering the view)

xnubar commented 5 months ago

I was facing the same issue, it started to work after I create:

# config/initializers/geocoder.rb
Geocoder.configure(
  units: :km,
  distances: :spherical
)