Casecommons / pg_search

pg_search builds ActiveRecord named scopes that take advantage of PostgreSQL’s full text search
http://www.casebook.net
MIT License

queries are much slower than native ruby (even including pre-fetch)? #265

Open pik opened 9 years ago

pik commented 9 years ago

Hello - apologies for the generic issue title; hopefully I can edit it to be more specific once I'm a little clearer on the source of this. We are using the pg_search gem for an auto-complete, and I'm somewhat puzzled by its performance (this may be Postgres- rather than pg_search-related; more on that below).

Here is a query generated by the pg_search gem (and explain analyze output below):

EXPLAIN ANALYZE 
SELECT  "variants".id, "variants".search,
((ts_rank(("variants"."search"), 
(to_tsquery('simple', ''' ' || 'black' || ' ''' || ':*')), 0))) 
AS pg_search_rank FROM "variants" 
INNER JOIN "products" ON "products"."id" = "variants"."product_id" 
WHERE "variants"."account_id" = $1 
AND ((("variants"."search") @@ (to_tsquery('simple', ''' ' || 'black' || ' ''' || ':*')))) 
AND ("variants"."status" != 'archived')  
ORDER BY pg_search_rank DESC, "variants"."id" ASC, "variants"."id" 
DESC LIMIT 20 OFFSET 0;
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=165332.42..165332.47 rows=20 width=167) (actual time=1686.098..1686.103 rows=20 loops=1)
   ->  Sort  (cost=165332.42..165386.88 rows=21784 width=167) (actual time=1686.096..1686.099 rows=20 loops=1)
         Sort Key: (ts_rank(variants.search, '''black'':*'::tsquery, 0)), variants.id
         Sort Method: top-N heapsort  Memory: 30kB
         ->  Nested Loop  (cost=382.29..164752.75 rows=21784 width=167) (actual time=354.103..1626.966 rows=128052 loops=1)
               ->  Bitmap Heap Scan on variants  (cost=381.86..98421.65 rows=21784 width=171) (actual time=354.079..1130.804 rows=128052 loops=1)
                     Recheck Cond: ((account_id = $1) AND (search @@ '''black'':*'::tsquery))
                     Rows Removed by Index Recheck: 501447
                     Filter: ((status)::text <> 'archived'::text)
                     Heap Blocks: exact=36190 lossy=53808
                     ->  Bitmap Index Scan on index_variants_on_search_and_account_id  (cost=0.00..376.42 rows=28042 width=0) (actual time=343.227..343.227 rows=223097 loops=1)
                           Index Cond: ((account_id = $1) AND (search @@ '''black'':*'::tsquery))
               ->  Index Only Scan using families_pkey on products  (cost=0.43..3.03 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=128052)
                     Index Cond: (id = variants.product_id)
                     Heap Fetches: 0
 Planning time: 1.169 ms
 Execution time: 1686.219 ms
(17 rows)
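For context, a query of this shape typically comes out of a scope along these lines. This is a sketch of an assumed setup, not the app's actual code: the scope name is hypothetical, and the `search` tsvector column is presumed to be maintained separately (e.g. by a trigger).

```ruby
class Variant < ActiveRecord::Base
  include PgSearch  # pg_search 1.x; newer releases use PgSearch::Model

  pg_search_scope :autocomplete,
                  against: :search,
                  using: {
                    tsearch: {
                      prefix: true,             # appends :* for prefix matching
                      dictionary: 'simple',     # to_tsquery('simple', ...)
                      tsvector_column: 'search' # match the stored tsvector directly
                    }
                  }
end
```

Calling `Variant.autocomplete('black')` would then produce the `to_tsquery('simple', ... || ':*')` predicate and `pg_search_rank` ordering seen in the query above.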

The execution time is absolutely staggering. Here is what this looks like if we pre-fetch the objects in native ruby:

EXPLAIN ANALYZE SELECT id, search FROM variants where account_id = $1;
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on variants  (cost=2686.62..658427.87 rows=289370 width=167) (actual time=133.572..1139.615 rows=291264 loops=1)
   Recheck Cond: (account_id = $1)
   Rows Removed by Index Recheck: 616695
   Heap Blocks: exact=41819 lossy=80014
   ->  Bitmap Index Scan on index_variants_on_search_and_account_id  (cost=0.00..2614.27 rows=289370 width=0) (actual time=122.186..122.186 rows=291264 loops=1)
         Index Cond: (account_id = $1)
 Planning time: 0.226 ms
 Execution time: 1158.538 ms
(8 rows)
res = con.exec("select id,search from variants where account_id = $1").entries
[7] pry(main)> res.size
=> 291264

require 'benchmark/ips'

Benchmark.ips do |x|
  x.report do
    res.select { |i| i['search'].include?('black') }
  end
end

Calculating -------------------------------------
                         1.000  i/100ms
-------------------------------------------------
                          8.601  (±11.6%) i/s -     41.000
=> #<Benchmark::IPS::Report:0x007fa9a5b7bbb8
 @data=nil,
 @entries=[#<Benchmark::IPS::Report::Entry:0x007fa9369c7f20 @ips=8.600967724341794, @ips_sd=1, @iterations=41, @label="", @measurement_cycle=1, @microseconds=5050743.103027344, @show_total_time=false>]>
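The same in-memory filter can be timed with only the standard library; here synthetic rows stand in for the real `variants` data, so the row shape and contents are assumptions:

```ruby
require 'benchmark'

# Synthetic stand-in for the ~290k prefetched rows.
rows = Array.new(290_000) do |i|
  { 'search' => "variant #{i} #{i.even? ? 'black' : 'white'} shirt" }
end

matches = nil
elapsed = Benchmark.realtime do
  # The same naive substring scan as the benchmark above.
  matches = rows.select { |r| r['search'].include?('black') }
end
puts format('filtered %d of %d rows in %.1f ms',
            matches.size, rows.size, elapsed * 1000)
```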

In other words, Ruby .include? will run in ~100ms on pre-fetched objects once they are loaded into memory (~70ms in Python), whereas the native SQL never goes under 1000ms. To see if this can be made faster (perhaps pg_search is building suboptimal indexes or queries), I've tried re-indexing with both gin and gist, as well as building separate indexes for a nested select, e.g.:

"index_variants_on_search_and_account_id_as_gin" gin (account_id, search)
"index_variants_on_search_and_account_id_as_gist" gist (account_id, search)
"index_variants_on_account_id" btree (account_id)
"index_variants_on_search" gin (search)

None of these appears to make a difference that comes close to bridging the gap between the Postgres query run-time and native Ruby .include?, even though the latter is a naive O(n**2) lookup.

amarshall commented 9 years ago

Ruby include? will run at ~100ms on pre-fetched objects once they are loaded into memory (~70ms in python).

Though when you include the query time, it’s ~1250ms vs. PgSearch’s query’s ~1680ms. In that case it’s not really that much faster (¾ the time, or 1.34x faster), and with that extra time you gain lower memory consumption on the client, less network transfer, and an intelligent search rank. Sure, if you’ve actually already loaded the entire search space into memory, then yes, it’s going to be faster, and the first two cons (memory and network) are moot; but you still give up the rank.

That said…

To see if this can be made faster (perhaps 'pg_search' is building suboptimal indexes or queries) - I've tried re-indexing with both gin and gist, as well as building separate indexes for a nested select

The PgSearch query looks like it’s using the search indexes just fine, the problem may be the condition ("variants"."status" != 'archived'), which doesn’t appear to be indexed. What happens if you remove that from your query? Additionally, the INNER JOIN "products" ON "products"."id" = "variants"."product_id" would also add some time to the query, which doesn’t seem to be needed for the search.

the latter is a naive O(n**2) lookup.

It’s not; it’s O(n*m), where n is the number of records and m is the average field length.
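To put rough numbers on that model: n is the row count from the EXPLAIN output above, while m, the average length of the search column, is an assumed value here.

```ruby
# Total characters scanned by the naive filter is roughly n * m:
# each of the n fields is scanned once, end to end.
n = 291_264           # rows returned for the account (from the EXPLAIN above)
m = 60                # assumed average length of the search column, in chars
chars_scanned = n * m # ~17.5 million chars, well within what Ruby's
                      # C-level String#include? can cover in ~100 ms
```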


Unless I’m missing something (and I might be, as it’s quite late), I’m inclined to think there isn’t a problem here. Postgres’s full text search is not exactly fast, and FWIW Case Commons migrated away from it to Elasticsearch primarily for performance; the difference is substantial with our medium+ datasets.

pik commented 9 years ago

@amarshall Sorry, I neglected to include the EXPLAIN ANALYZE for the stripped-down version of the above query:

EXPLAIN ANALYZE SELECT  "variants".id, "variants".search from "variants"
WHERE account_id = $1 
AND ((("variants"."search") @@ (to_tsquery('simple', ''' ' || 'black' || ' ''' || ':*'))));
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on variants  (cost=383.43..98353.11 rows=28042 width=167) (actual time=336.732..1085.490 rows=128052 loops=1)
   Recheck Cond: ((account_id = $1) AND (search @@ '''black'':*'::tsquery))
   Rows Removed by Index Recheck: 501447
   Heap Blocks: exact=36190 lossy=53808
   ->  Bitmap Index Scan on index_variants_on_search_and_account_id  (cost=0.00..376.42 rows=28042 width=0) (actual time=326.986..326.986 rows=223097 loops=1)
         Index Cond: ((account_id = $1) AND (search @@ '''black'':*'::tsquery))
 Planning time: 0.616 ms
 Execution time: 1093.949 ms
(8 rows)

That gives about the same performance as the naive Ruby fetch (give or take some variance). The problem is that I'd really expect Postgres text search with pre-built indexes to be faster than fetching the objects and doing a naive string .include? -- e.g. considering that with a web server, the entire data set for an account can be cached in memory and maybe re-used 50-100 times before it expires.
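Spelling out that amortization argument: the per-search timings below come from the EXPLAIN outputs and the benchmark above, while the reuse count is an assumption.

```ruby
fetch_ms  = 1159.0 # one-time prefetch of the account's rows (EXPLAIN: ~1158 ms)
filter_ms = 100.0  # each in-memory .include? pass over the cached rows
sql_ms    = 1094.0 # each Postgres tsquery search (EXPLAIN: ~1094 ms)
reuses    = 50     # assumed number of searches before the cache expires

# Amortize the prefetch across every search served from the cache.
cached_per_search = (fetch_ms + reuses * filter_ms) / reuses
# ~123 ms per search with the cache vs ~1094 ms per search in SQL
```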

How are things working for you with Elasticsearch? We are considering this as an alternative as well -- but I wanted to be sure we weren't missing something essential that would enable us to greatly improve the Postgres performance. Are these numbers within +/- 25% of what you would expect?