publiclab / plots2

a collaborative knowledge-exchange platform in Rails; we welcome first-time contributors! :balloon:
https://publiclab.org
GNU General Public License v3.0
957 stars 1.83k forks source link

Research why loading nearbyPeople Srch API is so slow to load #7556

Open nstjean opened 4 years ago

nstjean commented 4 years ago

In our maps we are now loading data from https://publiclab.org/api/srch/nearbyPeople?nwlat=10.0&selat=0.0&nwlng=0.0&selng=10.0 This code is located here: https://github.com/publiclab/plots2/blob/master/app/api/srch/search.rb#L333

Accessing this API takes a lot longer than it should. For example, it takes 19.43s to load:

FireShot Capture 304 - API Tester - apitester com

Meanwhile another API we use in our map layers has far more results in only 83ms!

FireShot Capture 305 - API Tester - apitester com

Some of our other Search API requests perform much better than nearbyPeople. Why is that particular one so slow? In order for our maps to be useful we can't have such a long delay in loading every time we refresh or move the map.

nstjean commented 4 years ago

I suspect that we are going to have to use less search params in order to get the speeds we want. The problem is I'm not sure how we can test those params without pushing and merging the changes to the live site, and that's going to be a really slow iteration. Anyone want to look into this?

dburugupalli commented 4 years ago

@nstjean i could like to try this one. I think the issue could be because of the greater number of parameters to be processed. Due to which, it takes a larger amount of time to load and process the data.

nstjean commented 4 years ago

@dburugupalli Go for it!

sagarpreet-chadha commented 4 years ago

Yes we need to decrease the time taken by that query. Thanks!

jywarren commented 4 years ago

@dburugupalli my first suggestion is to try adding a limit to this. we should try to trace the query -- adding more info here:

https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L187-L190

This is the query, which i suspect runs real slow on the whole huge users table. We can limit with an additional:

.where(status: 1)

which will help a lot. We could also join the nodes table and insist that the user has a node which has node.status == 1 too. That'd help a lot too.

Yes, here is the Skylight report on this query:

image

Full report here: https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1583245980/6h/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople

That actually says the slow query is the user tags query...

So maybe when we run this, it is running on a TON of records:

https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L203-L208

We could limit it to 100

Also, are we re-running a user_tag query on each instance, on this line?

      item.user_tags.none? do |user_tag|

If so, we could optimize this.

jywarren commented 4 years ago

I think this'll take a little experimentation but if that is a nested query on item.user_tags.none? we should be able to speed this up a LOT by a) limiting the results and b) not nesting the query, but, inspecting the item and seeing what we can do to test if it has any tags without generating a new nested query.

jywarren commented 4 years ago

Looking at whether we need to add an index, or reduce recursion in query nesting...

image

jywarren commented 4 years ago

Hmm. user_tags has no indices: https://github.com/publiclab/plots2/blob/15589c95b54759b2828ea0d9b3571d2bebf8c2dc/db/schema.rb.example#L417-L422

Not sure if it needs them? Checking # of records now.

jywarren commented 4 years ago

Primary key ought to auto-index...

So, i ran it in the production console and it's really fast: UserTag.find_by(uid: 1)

So, i think it's the nesting - i bet we are running it on a ton of users.

SidharthBansal commented 4 years ago

https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L197 and https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L176 have User.where('rusers.status <> 0') .joins(:user_tags) same statements running twice! Can we optimize here by storing them ?

SidharthBansal commented 4 years ago

shouldn't there be status<>0 clause at user in https://github.com/publiclab/plots2/blob/783fbe2161bd70e13ad5e62ac6ab9254de3661b5/app/services/search_service.rb#L187 ? I think there should be !

SidharthBansal commented 4 years ago

Two possible ways to reduce time are short circuiting if possible, AND rearrangement of statements so that the first clause eliminates more cases then the upcoming clauses in sql statements.

jywarren commented 4 years ago

I think the first and second referenced in your comment are the latitude and longitude calcs I want to combine.

https://blog.nrowegt.com/rails-pg-ambiguous-column-reference/

jywarren commented 4 years ago

OK, i was able to get this to work:

u = User.where('rusers.status <> 0').joins(:user_tags).where('user_tags.value LIKE ?', 'lat%').where('REPLACE(user_tags.value, "lat:", "") BETWEEN 44 AND 44.1').joins('INNER JOIN `user_tags` AS `lontags` ON lontags.uid = rusers.id').where('lontags.value LIKE ?', 'lon%').where('REPLACE(lontags.value, "lon:", "") BETWEEN 20 AND 80')

Formatted:

u = User.where('rusers.status <> 0')
.joins(:user_tags)
.where('user_tags.value LIKE ?', 'lat%')
.where('REPLACE(user_tags.value, "lat:", "") BETWEEN 44 AND 44.1')
.joins('INNER JOIN `user_tags` AS `lontags` ON lontags.uid = rusers.id')
.where('lontags.value LIKE ?', 'lon%')
.where('REPLACE(lontags.value, "lon:", "") BETWEEN 20 AND 80')
jywarren commented 4 years ago

Hmm. @cesswairimu actually the block I'm still not sure about is:

      item.user_tags.none? do |user_tag|
        user_tag.name == "location:blurred"
      end

What's going on here?

jywarren commented 4 years ago

i think i found that the block is used to evaluate all items of the array and the whole thing returns true only if none of the array members return true? https://ruby-doc.org/core-2.7.0/Enumerable.html#method-i-none-3F

jywarren commented 4 years ago

OK, i added .limit(limit) in a couple places too. I hope this helps!!!

SidharthBansal commented 4 years ago

https://github.com/publiclab/plots2/issues/7556#issuecomment-610625735

Could the replace statements move towards end? After sorting, we will have lesser number of rows so replacing at last step can improve(I don't have full context, so it may be wrong)

jywarren commented 4 years ago

https://github.com/publiclab/plots2/pull/7795 is now merged, although one of the .limit() statements had to be removed. We should re-test this after today's live site republication and see if performance improved?

Current:

https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1586809440/1d/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople

image

jywarren commented 4 years ago

Now not seeing much more than 8s, much better than 60!!!

https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1586902080/30m/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople

but needs more testing.

jywarren commented 4 years ago

Test it out at this zoom level: https://publiclab.org/map#9/40.71372914097432/-73.47581131383778

cc zoom level optimizations: https://github.com/publiclab/plots2/issues/7822

jywarren commented 4 years ago

OK, so i'm not seeing a huge difference between before and after April 14; both are averaging around 17 seconds or so, and both have outliers up to more than a minute of load time, so not good. But even before the 14th, the "what time is spent on" chart is quite different for different periods.

We just don't have a huge amount of traffic so statistics aren't great, but I'll screenshot a few "moments" so we can look at different possible sources of inefficiency:

https://oss.skylight.io/app/applications/GZDPChmcfm1Q/1586919420/1d/endpoints/Srch::Search%20%5BGET%5D%20srch%2FnearbyPeople

image

this has 3 queries:

SELECT DISTINCT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` WHERE (rusers.status <> ?) AND (value LIKE ?) AND (REPLACE(value, ?, ?) BETWEEN ? AND ?)
SELECT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` WHERE (rusers.status <> ?) AND (rusers.id IN ?) AND (user_tags.value LIKE ?) AND (REPLACE(user_tags.value, ?, ?) BETWEEN ? AND ?)
SELECT  `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` WHERE (rusers.status <> ?) AND (rusers.id IN ?) AND (user_tags.value LIKE ?) AND (REPLACE(user_tags.value, ?, ?) BETWEEN ? AND ?) ORDER BY created_at DESC LIMIT ?

That was all just one query, for the period of time before the April 14th optimization.

This next one was after the optimization:

image

The query was more simple:

SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ?

It does seem that most slowness after April 14th was to do with user_tags for a single user... that's weird.

Could it be that for large sets of users, this is just these lines running over and over?

https://github.com/publiclab/plots2/blob/c6aabd6727ce17a55ac6ecbd428c8c7c5145ab80/app/services/search_service.rb#L204-L206

If true (and we should be able to test that on the console?) we should be able to do this filtering a different way, as we should already have loaded the user tags and be able to additionally sort by them. But maybe we haven't eager-loaded the associations?

Could we a) do this in 2 batches, one for users with a location:blurred tag and once without, and/or b) eager-load the user tags so we can filter without doing individual queries for each user?

emilyashley commented 4 years ago

Also, pssst @Tlazypanda the staff and I were talking about this today and I thought you may also want to put eyes on it since it is performance related! :)

Tlazypanda commented 4 years ago

@emilyashley Yeah sure! Would need a bit of context on this first though :sweat_smile: Will look into it :smile:

jywarren commented 3 years ago

Hi @daemon1024 -- i wondered if this might be one you'd like to take up in your project, if you end up having time. It may have a big impact on usability. Thanks!

daemon1024 commented 3 years ago

Hey @jywarren, Definitely. I actually mentioned benchmarking the People API in my proposal and this gives a lot of relevant context as well.

jywarren commented 3 years ago

Hi @daemon1024 - I am looking at this next and I think @Manasa2850 and @17sushmita may have some insights into this as it's a query heavily based on the tagging system.

I'd say we should start with:

  1. researching the causes for slowness
  2. some brainstorming around ways to improve it: caching? query optimization? other options?

The whole query is in this method:

https://github.com/publiclab/plots2/blob/6832a437ed2a6451d6bc30bb9876906a9ec92eb8/app/services/search_service.rb#L164-L239

17sushmita commented 3 years ago

This seems to be an interesting issue 🤔 . Looking into this if I can find anything.

daemon1024 commented 3 years ago

So the problematic queries are still the same

SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ?

and

SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ? AND (value LIKE ?)

The second one is quite similar to first one so bound to be problematic :P

The ? here denotes that we are looping and repeatedly making these queries for different uids.

Skylight suggests that we eager load things and predefine assosciations we want so as the generated query is not repetitive

- SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ?
+ SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` IN ( some raange here ..... )

Ref : https://www.skylight.io/support/performance-tips#solving-the-n1-query-problem

I believe the problematic part is https://github.com/publiclab/plots2/blob/6832a437ed2a6451d6bc30bb9876906a9ec92eb8/app/services/search_service.rb#L206-L211 Since that's the only part we are looping I think. I am not really sure though.

daemon1024 commented 3 years ago

Here's a guide to how we can eager load things https://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations

https://github.com/publiclab/plots2/blob/6832a437ed2a6451d6bc30bb9876906a9ec92eb8/app/services/search_service.rb#L201

I believe we can modify this line to be something like

 user_locations = user_locations.includes(:user_tag).where('rusers.id IN (?)', uids).distinct 

What do you think about this?

To confirm our hypothesis maybe we can just choose to comment out blurred locations code and see our performance.

jywarren commented 3 years ago

Let's try eager loading, push it to unstable, and try it out, timing things in CURL for now? It's a little more primitive but could work:

https://www.badunetworks.com/performance-testing-curl-part-1-basics/

Thank you!!!

jywarren commented 3 years ago

I think we'll probably have to try a few different measures to reduce the load time. We could also filter results more if it's raw database size that's causing trouble. But reducing nested queries may help a lot. I like the idea of testing by:

To confirm our hypothesis maybe we can just choose to comment out blurred locations code and see our performance.

We could push such a test to unstable too in order to test it out.

Just be aware (@)icarito is using unstable a lot these past few days to test out an image upload issue. So if you mention his name in the chatroom you can both use unstable by taking turns or just making the other aware so you don't both try to push to it at the same time.

Thank you!!!

daemon1024 commented 3 years ago

I tested thing out on https://unstable.publiclab.org/api/srch/nearbyPeople?nwlat=55.3791104480105&selat=23.40276490540795&nwlng=-128.67187500000003&selng=178.76953125

Well the improvements are 10x already. From 30+seconds to 2-4seconds

Also tried to the query mentioned in the main issue body i.e. nwlat=10.0&selat=0.0&nwlng=0.0&selng=10.0 seeing improvements on a similar level ( near about 10x )

daemon1024 commented 3 years ago

So the next step would be to try eager loading. This should help in our performance significantly I believe.

I think we'll probably have to try a few different measures to reduce the load time.

Agreed. Will look into further improvements after improving this.

daemon1024 commented 3 years ago

The performance now seems to be regressed back to normal. I think it's not the right way to eager load the query. Will have to look into how to do it the correct way. Any inputs appreciated!

daemon1024 commented 3 years ago

So we have a similar construct in nearbyNodes too. https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L144-L148

And that uses eager loading too. https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L137

History for this points to https://github.com/publiclab/plots2/pull/1987/.

daemon1024 commented 3 years ago

Reverted this commit since it broke things 😅

Tlazypanda commented 3 years ago

Hey @daemon1024 just noting here https://github.com/flyerhzm/bullet you can set this gem up locally to test whether the n+1 problem is getting resolved or not :v: This really helped me a lot to figure out the exact traceback of the n+1 query issue and change my code accordingly. Once the issue is resolved it will also stop generating the warning for n+1

jywarren commented 3 years ago

Hi @daemon1024 can you link us to the PR you're working in? Sorry if I missed that!

daemon1024 commented 3 years ago

@jywarren There's no PR for now 😅. Most of my testing has been in local or directly on unstable. Will make a PR once I have something concrete. Rest noted my observations above.

Thanks @Tlazypanda for the gem, looks very helpful and will get back to you after trying it out 🙌

jywarren commented 3 years ago

Ah i see no problem! - still, good for us to be able to see your code so far, so don't be afraid to share even rough early code!

17sushmita commented 3 years ago

Hi @daemon1024

So the problematic queries are still the same

SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ?

This one is due to the looping over on every node by the user to check if it has location: blurred.

https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L202-L207 So can be resolved by using a simpler rails query

    items = items.where('user_tags.value <> "location:blurred"')

Second query still needs to be debugged

and

SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ? AND (value LIKE ?)
17sushmita commented 3 years ago

I'm not really sure about the second query but I observed that when the limit is 5 it performs 20 such queries - 4 query for each record and the 4 queries are in following pattern-

  UserTag Load (29.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'lat:%')
  UserTag Load (29.1ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'lon:%')
  UserTag Exists (15.0ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (39.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'location:%')

First one for loading tag like 'lat:%' second for loading tag like 'lon:%' and third one for checking if 'location:%' exists or not and fourth for loading the tag like 'location:%' on the same user.

We are only using LIKE 'lat%' and LIKE 'lon%' queries in this rails query- https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L176-L183 and checking if location: blurred exists in this query- https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L202-L207 which after optimizing will look like-

    items = items.where('user_tags.value <> "location:blurred"')

So, I think these queries are automatically performed by rails ORM to verify if returned records have these attributes or not.

Complete trace of queries executed for the following parameters - http://publiclab.org/api/srch/nearbyPeople?nwlat=55.3791104480105&selat=23.40276490540795&nwlng=-128.67187500000003&selng=178.76953125 are (placed dummy user ids in place of original ones) -

  UserTag Load (29.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'lat:%')
  UserTag Load (29.1ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'lon:%')
  UserTag Exists (15.0ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (39.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'location:%')
  UserTag Load (42.6ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'lat:%')
  UserTag Load (47.8ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'lon:%')
  UserTag Exists (14.7ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (43.4ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'location:%')
  UserTag Load (39.3ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'lat:%')
  UserTag Load (43.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'lon:%')
  UserTag Exists (12.0ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (47.6ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'location:%')
  UserTag Load (42.4ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'lat:%')
  UserTag Load (40.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'lon:%')
  UserTag Exists (14.2ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (41.4ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'location:%')
  UserTag Load (38.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'lat:%')
  UserTag Load (41.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'lon:%')
  UserTag Exists (13.7ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (45.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'location:%')

These are some stack overflow questions that I found had some similar issues like us (Not sure)- https://stackoverflow.com/questions/50483971/why-is-ruby-on-rails-executing-select-1-as-one-queries https://stackoverflow.com/questions/53125413/page-loads-unnecessary-queries-ruby-on-rails

daemon1024 commented 3 years ago
Details

image

Order of execution of things I see in skylight.

  1. The first query:

    SELECT DISTINCT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` INNER JOIN user_tags AS lontags ON lontags.uid = rusers.id WHERE (rusers.status <> ?) AND (user_tags.value LIKE ?) AND (REPLACE(user_tags.value, ?, ?) BETWEEN ? AND ?) AND (lontags.value LIKE ?) AND (REPLACE(lontags.value, ?, ?) BETWEEN ? AND ?)

    which I believe translates from https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L176-L183

  2. The second query:

    SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ?

    which is already fixed in #9940 🥳

  3. The third query:

    SELECT  DISTINCT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = `rusers`.`id` INNER JOIN user_tags AS lontags ON lontags.uid = rusers.id WHERE (rusers.status <> ?) AND (user_tags.value LIKE ?) AND (REPLACE(user_tags.value, ?, ?) BETWEEN ? AND ?) AND (lontags.value LIKE ?) AND (REPLACE(lontags.value, ?, ?) BETWEEN ? AND ?) ORDER BY created_at DESC LIMIT ?

    This is where things get interesting. We are kind of repeating our query. with some additional params)(ORDER BY created_at DESC LIMIT ?). And I believe this is caused by https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L232-L234

    So we need to debug our recency sort code?

  4. The consequent queries

    SELECT  ? AS one FROM `user_tags` WHERE `user_tags`.`uid` = ? AND (value LIKE ?) LIMIT ?

    and

    SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = ? AND (value LIKE ?)

    ( the problematic one )

daemon1024 commented 3 years ago

So, I think these queries are automatically performed by rails ORM to verify if returned records have these attributes or not.

Agreed, rails is probably is running it to verify against something we have triggered. I believe it's the ordering code https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L220-L235

What do you think @17sushmita ?

Also can you provide me a full stack trace? Like fuller than you share :P which includes all the queries triggered by this service ( like select distinct ones )

Thank you.

17sushmita commented 3 years ago

There were 4 types of queries executed by this service -

Basically the looping part is breaking our whole single rails query into multiple and so adding redundancy because after looping the previous query is executed again with all the additional queries after the looping.

Full stack trace of original queries triggered by this service before fixing(I have placed ? in place of actual user ids and in the last type of queries to maintain the pattern of 4 queries on the same user I have placed dummy user ids in place of original ones)- https://gist.github.com/17sushmita/54fbb31b064d9f643fe58ddc41d616c4 (pretty long so sharing as gist)

After the change in #9940, the second query is fixed and we have a single query for first, second(fixed one) and third so now we have only two types of queries-

This is the full stack trace of queries executed by the service after making the change. (I have placed some dummy user ids in place of actual user ids.)

  User Load (3763.5ms)  SELECT  DISTINCT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON `user_tags`.`uid` = 
 `rusers`.`id` INNER JOIN user_tags AS lontags ON lontags.uid = rusers.id WHERE (rusers.status <> 0) AND (user_tags.value 
  LIKE 'lat%') AND (REPLACE(user_tags.value, "lat:", "") BETWEEN 23.40276490540795 AND 55.3791104480105) AND 
  (lontags.value LIKE 'lon%') AND (REPLACE(lontags.value, "lon:", "") BETWEEN -128.67187500000003 AND 178.76953125) 
  AND (user_tags.value <> "location:blurred") ORDER BY created_at DESC LIMIT 5

  UserTag Load (260.6ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'lat:%')
  UserTag Load (50.4ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'lon:%')
  UserTag Exists (26.5ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (63.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1000 AND (value LIKE 'location:%')
  UserTag Load (50.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'lat:%')
  UserTag Load (55.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'lon:%')
  UserTag Exists (24.3ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (52.8ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1001 AND (value LIKE 'location:%')
  UserTag Load (45.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'lat:%')
  UserTag Load (54.7ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'lon:%')
  UserTag Exists (22.6ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (61.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1002 AND (value LIKE 'location:%')
  UserTag Load (53.7ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'lat:%')
  UserTag Load (52.1ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'lon:%')
  UserTag Exists (22.8ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (53.6ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1003 AND (value LIKE 'location:%')
  UserTag Load (52.9ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'lat:%')
  UserTag Load (57.9ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'lon:%')
  UserTag Exists (19.2ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (63.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 1004 AND (value LIKE 'location:%')
daemon1024 commented 3 years ago

Are you sure about

This is the full stack trace of queries executed by the service after making the change. (I have placed some dummy user ids in place of actual user ids.)

the trace below this being full?

I think there should be a few other relevant queries above it. Since the trace you are sharing starts with ORDER BY created_at DESC LIMIT 5 and I believe it shouldn't be triggered until the end of the function.

Let me know if I am in the wrong.

Rest, I have understood your point regarding looping.

daemon1024 commented 3 years ago

To confirm my hypothesis, maybe we can comment out https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L218-L235 this part and check the query logs again.

17sushmita commented 3 years ago

the trace below this being full?

Yes, for the parameters in this URL - http://publiclab.org/api/srch/nearbyPeople?nwlat=55.3791104480105&selat=23.40276490540795&nwlng=-128.67187500000003&selng=178.76953125 we are not passing any order direction so it goes to the default i.e else case - https://github.com/publiclab/plots2/blob/b70cbefa443ace9be1e83a5dde56a10e9603a135/app/services/search_service.rb#L231-L234

and I also tried printing the order_direction, it was DESC by default this is why we have ORDER BY created_at DESC

I tried commenting out the sorting part and the current code looks like this -

  def tagNearbyPeople(coordinates, tag, field, period = nil, sort_by = nil, order_direction = nil, limit = 10)
    raise("Must contain all four coordinates") if coordinates["nwlat"].nil?
    raise("Must contain all four coordinates") if coordinates["nwlng"].nil?
    raise("Must contain all four coordinates") if coordinates["selat"].nil?
    raise("Must contain all four coordinates") if coordinates["selng"].nil?

    raise("Must be a float") unless coordinates["nwlat"].is_a? Float
    raise("Must be a float") unless coordinates["nwlng"].is_a? Float
    raise("Must be a float") unless coordinates["selat"].is_a? Float
    raise("Must be a float") unless coordinates["selng"].is_a? Float

    raise("If 'from' is not null, must contain date") if period["from"] && !(period["from"].is_a? Date)
    raise("If 'to' is not null, must contain date") if period["to"] && !(period["to"].is_a? Date)

    user_locations = User.where('rusers.status <> 0')
                         .joins(:user_tags)
                         .where('user_tags.value LIKE ?', 'lat%')
                         .where('REPLACE(user_tags.value, "lat:", "") BETWEEN ' + coordinates["selat"].to_s + ' AND ' + coordinates["nwlat"].to_s)
                         .joins('INNER JOIN user_tags AS lontags ON lontags.uid = rusers.id')
                         .where('lontags.value LIKE ?', 'lon%')
                         .where('REPLACE(lontags.value, "lon:", "") BETWEEN ' + coordinates["nwlng"].to_s + ' AND ' + coordinates["selng"].to_s)
                         .distinct

    if tag.present?
      if field.present? && field == 'node_tag'
        tids = Tag.where("term_data.name = ?", tag).collect(&:tid).uniq || []
        uids = TagSelection.where('tag_selections.tid IN (?)', tids).collect(&:user_id).uniq || []
      else
        uids = User.where('rusers.status <> 0')
                   .joins(:user_tags)
                   .where('user_tags.value = ?', tag)
                   .where(id: user_locations.select("rusers.id"))
                   .limit(limit)
                   .collect(&:id).uniq || []
      end
      user_locations = user_locations.where('rusers.id IN (?)', uids).distinct
    end

    items = user_locations

    # selects the items whose node_tags don't have the location:blurred tag
    items = items.where('user_tags.value <> "location:blurred"')

    # Here we use period["from"] and period["to"] in the query only if they have been specified,
    # so we avoid to join revision table
    if !period["from"].nil? || !period["to"].nil?
      items = items.joins(:revisions).where("node_revisions.status = 1")\
                   .distinct
      items = items.where('node_revisions.timestamp > ' + period["from"].to_time.to_i.to_s) unless period["from"].nil?
      items = items.where('node_revisions.timestamp < ' + period["to"].to_time.to_i.to_s) unless period["to"].nil?
    # ----------------------Modified Code-----------------------------------#
    # # # # added else to return the result and limit to limit the result    
    else
      items.order('created_at DESC').limit(limit)
    end

    # sort users by their recent activities if the sort_by==recent

    # if sort_by == "recent"
    #   items.joins(:revisions).where("node_revisions.status = 1")\
    #        .order("node_revisions.timestamp #{order_direction}")
    #        .distinct
    # elsif sort_by == "content"
    #   ids = items.collect(&:id).uniq || []
    #   User.select('`rusers`.*, count(`node`.uid) AS ord')
    #       .joins(:node)
    #       .where('rusers.id IN (?)', ids)
    #       .group('`node`.`uid`')
    #       .order("ord #{order_direction}")
    # else
    #   items.order("created_at #{order_direction}")
    #         .limit(limit)
    # end
    # ---------------------- End of Modified Code-----------------------------------#

  end

I added the else block because it wasn't returning any data as it didn't go to the if case and we didn't have a default case. Also, without the limit query it resulted all the users from the resulting query (which were too many) so took a really long time (because of the automatic 4 queries on each user). Also, after limiting, resulting users were different from previous ones because we don't have any sorting so the results were in random order

Following is the full query stack trace of the above code-

  User Load (7.9ms)  SELECT  DISTINCT `rusers`.* FROM `rusers` INNER JOIN `user_tags` ON 
  `user_tags`.`uid` = `rusers`.`id` INNER JOIN user_tags AS lontags ON lontags.uid = rusers.id 
  WHERE (rusers.status <> 0) AND (user_tags.value LIKE 'lat%') AND (REPLACE(user_tags.value, 
  "lat:", "") BETWEEN 23.40276490540795 AND 55.3791104480105) AND (lontags.value LIKE 'lon%') 
  AND (REPLACE(lontags.value, "lon:", "") BETWEEN -128.67187500000003 AND 178.76953125) AND 
  (user_tags.value <> "location:blurred") LIMIT 5

  UserTag Load (38.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2000 AND (value LIKE 'lat:%')
  UserTag Load (45.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2000 AND (value LIKE 'lon:%')
  UserTag Exists (0.4ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 2000 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (41.1ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2000 AND (value LIKE 'location:%')
  UserTag Load (43.9ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2001 AND (value LIKE 'lat:%')
  UserTag Load (33.7ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2001 AND (value LIKE 'lon:%')
  UserTag Exists (1.1ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 2001 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (47.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2001 AND (value LIKE 'location:%')
  UserTag Load (38.3ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2002 AND (value LIKE 'lat:%')
  UserTag Load (43.2ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2002 AND (value LIKE 'lon:%')
  UserTag Exists (0.7ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 2002 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (38.1ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2002 AND (value LIKE 'location:%')
  UserTag Load (34.9ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2003 AND (value LIKE 'lat:%')
  UserTag Load (37.4ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2003 AND (value LIKE 'lon:%')
  UserTag Exists (0.9ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 2003 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (47.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2003 AND (value LIKE 'location:%')
  UserTag Load (48.0ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2004 AND (value LIKE 'lat:%')
  UserTag Load (37.9ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2004 AND (value LIKE 'lon:%')
  UserTag Exists (0.8ms)  SELECT  1 AS one FROM `user_tags` WHERE `user_tags`.`uid` = 2004 AND (value LIKE 'location:%') LIMIT 1
  UserTag Load (43.5ms)  SELECT `user_tags`.* FROM `user_tags` WHERE `user_tags`.`uid` = 2004 AND (value LIKE 'location:%')
daemon1024 commented 3 years ago

Ohh. Thank You. That makes sense. I will try to look into it once more.