JustFixNYC / tenants2

The JustFix tenant platform!
https://app.justfix.org/
GNU General Public License v3.0
23 stars 4 forks source link

Improve the performance of data-driven onboarding queries #2042

Open toolness opened 3 years ago

toolness commented 3 years ago

It looks like users have to wait a good 2-3 seconds after typing in their address to get DDO suggestions, which is much slower than say, Who Owns What, where users are shown the results of their query in about 300-500 ms.

I'm not sure how much this affects our day-to-day users, but given that this is the very first step in our funnel, and the fact that speed matters, I suspect it contributes at least a little to drop-offs. I know I've been avoiding using DDO during development because of its lengthy query time, opting instead to just type in URLs manually whenever I want to go to a particular part of the site. In fact, I was so annoyed by its performance while developing DDO that I implemented very basic caching in #781, but that doesn't help with a cold cache, and most users will never see its benefit.

Note that one distinction between the DDO query and the WoW query is that on DDO, geocoding is done on the server-side to support progressive enhancement, whereas in WoW the geocoding is done on the client-side, so it's likely that DDO will never be quite as fast unless we do more than just optimize the DDO SQL query.

That said, it looks like the majority of time is still spent in the SQL query. The following command runs the SQL query on 654 Park Place and prints the amount of time spent on it:

python manage.py shell -c "import timeit; from data_driven_onboarding.schema import *; print(timeit.timeit(lambda: run_ddo_sql_query('3012380016'), number=1))"

From my end, the first run of this takes almost 3 seconds, and subsequent runs take closer to 2 seconds, presumably because the database has done some sort of caching/optimization on its end.

samaratrilling commented 3 years ago

Ooh, this does sound useful! Great opportunity to use EXPLAIN ANALYZE on the ddo query.

samaratrilling commented 3 years ago

My timing of the same query:

dcra python manage.py shell -c "import timeit; from data_driven_onboarding.schema import *; print(timeit.timeit(lambda: run_ddo_sql_query('3012380016'), number=1))"

Creating tenants2_app_run ... done
Connection to database established.
This is version b1210e4c25a1041ce66092405b0a2b919fb8.mod.
1.9685296999996353
samaratrilling commented 3 years ago

Here's the full EXPLAIN ANALYZE results:

Nested Loop Left Join  (cost=324672.11..326372.98 rows=1 width=294) (actual time=1759.517..1763.367 rows=1 loops=1)
  Join Filter: (t.bbl = thv.bbl)
  CTE total_res_units
    ->  Index Scan using pluto_19v2_bbl_idx on pluto_19v2  (cost=0.42..8.45 rows=1 width=26) (actual time=0.025..0.026 rows=1 loops=1)
          Index Cond: (bbl = '3012380016'::bpchar)
  CTE count_of_assoc_bldgs
    ->  GroupAggregate  (cost=60.08..77.58 rows=200 width=88) (actual time=235.859..235.859 rows=1 loops=1)
          Group Key: (CASE WHEN (get_assoc_addrs_from_bbl.bbl IS NOT NULL) THEN '3012380016'::text ELSE '3012380016'::text END)
          ->  Sort  (cost=60.08..62.58 rows=1000 width=120) (actual time=235.700..235.709 rows=172 loops=1)
                Sort Key: (CASE WHEN (get_assoc_addrs_from_bbl.bbl IS NOT NULL) THEN '3012380016'::text ELSE '3012380016'::text END)
                Sort Method: quicksort  Memory: 38kB
                ->  Function Scan on get_assoc_addrs_from_bbl  (cost=0.25..10.25 rows=1000 width=120) (actual time=235.601..235.651 rows=172 loops=1)
  CTE major_boro_of_assoc_bldgs
    ->  Limit  (cost=20.75..20.75 rows=1 width=72) (actual time=236.029..236.030 rows=1 loops=1)
          ->  Sort  (cost=20.75..21.25 rows=200 width=72) (actual time=236.028..236.028 rows=1 loops=1)
                Sort Key: (count(*) FILTER (WHERE (get_assoc_addrs_from_bbl_1.bbl IS NOT NULL))) DESC
                Sort Method: quicksort  Memory: 25kB
                ->  HashAggregate  (cost=17.75..19.75 rows=200 width=72) (actual time=236.022..236.022 rows=2 loops=1)
                      Group Key: CASE WHEN (get_assoc_addrs_from_bbl_1.bbl IS NOT NULL) THEN '3012380016'::text ELSE '3012380016'::text END, get_assoc_addrs_from_bbl_1.boro
                      ->  Function Scan on get_assoc_addrs_from_bbl get_assoc_addrs_from_bbl_1  (cost=0.25..10.25 rows=1000 width=108) (actual time=235.931..235.967 rows=172 loops=1)
  CTE count_hpd_complaints
    ->  GroupAggregate  (cost=0.00..72861.83 rows=88 width=19) (actual time=550.710..550.710 rows=1 loops=1)
          Group Key: hpd_complaints.bbl
          ->  Seq Scan on hpd_complaints  (cost=0.00..72860.51 rows=88 width=15) (actual time=229.645..550.693 rows=3 loops=1)
                Filter: ((receiveddate > '2014-01-01'::date) AND (bbl = '3012380016'::bpchar))
                Rows Removed by Filter: 2386231
  CTE count_hpd_violations
    ->  GroupAggregate  (cost=5.90..772.85 rows=46 width=27) (actual time=0.067..0.067 rows=0 loops=1)
          Group Key: hpd_violations.bbl
          ->  Bitmap Heap Scan on hpd_violations  (cost=5.90..771.93 rows=46 width=17) (actual time=0.066..0.066 rows=0 loops=1)
                Recheck Cond: (bbl = '3012380016'::bpchar)
                Filter: ((violationstatus <> 'Close'::text) AND (novissueddate > '2010-01-01'::date))
                Rows Removed by Filter: 12
                Heap Blocks: exact=8
                ->  Bitmap Index Scan on hpd_violations_bbl_idx  (cost=0.00..5.89 rows=194 width=0) (actual time=0.040..0.040 rows=12 loops=1)
                      Index Cond: (bbl = '3012380016'::bpchar)
  CTE num_days
    ->  Bitmap Heap Scan on hpd_violations hpd_violations_1  (cost=5.90..772.09 rows=63 width=32) (actual time=0.046..0.060 rows=8 loops=1)
          Recheck Cond: (bbl = '3012380016'::bpchar)
          Filter: ((novissueddate > '2010-01-01'::date) AND (currentstatus = 'VIOLATION CLOSED'::text))
          Rows Removed by Filter: 4
          Heap Blocks: exact=8
          ->  Bitmap Index Scan on hpd_violations_bbl_idx  (cost=0.00..5.89 rows=194 width=0) (actual time=0.034..0.034 rows=12 loops=1)
                Index Cond: (bbl = '3012380016'::bpchar)
  CTE avg_wait_time
    ->  HashAggregate  (cost=1.57..2.36 rows=63 width=76) (actual time=0.081..0.081 rows=1 loops=1)
          Group Key: num_days.bbl
          ->  CTE Scan on num_days  (cost=0.00..1.26 rows=63 width=48) (actual time=0.053..0.067 rows=8 loops=1)
  CTE violation_lengths_for_portfolio
    ->  Nested Loop  (cost=18.71..154878.99 rows=1060227 width=53) (actual time=242.339..254.360 rows=4045 loops=1)
          ->  HashAggregate  (cost=12.75..14.75 rows=200 width=44) (actual time=242.220..242.258 rows=172 loops=1)
                Group Key: get_assoc_addrs_from_bbl_2.bbl
                ->  Function Scan on get_assoc_addrs_from_bbl get_assoc_addrs_from_bbl_2  (cost=0.25..10.25 rows=1000 width=44) (actual time=242.135..242.148 rows=172 loops=1)
          ->  Bitmap Heap Scan on hpd_violations hpd_violations_2  (cost=5.96..760.42 rows=65 width=36) (actual time=0.027..0.063 rows=24 loops=172)
                Recheck Cond: (bbl = get_assoc_addrs_from_bbl_2.bbl)
                Filter: ((novissueddate >= '2010-01-01'::date) AND (currentstatus = 'VIOLATION CLOSED'::text))
                Rows Removed by Filter: 23
                Heap Blocks: exact=3965
                ->  Bitmap Index Scan on hpd_violations_bbl_idx  (cost=0.00..5.94 rows=201 width=0) (actual time=0.018..0.018 rows=46 loops=172)
                      Index Cond: (bbl = get_assoc_addrs_from_bbl_2.bbl)
  CTE avg_wait_time_for_portfolio
    ->  HashAggregate  (cost=26505.68..26508.18 rows=200 width=64) (actual time=256.780..256.782 rows=1 loops=1)
          Group Key: violation_lengths_for_portfolio.enteredbbl
          ->  CTE Scan on violation_lengths_for_portfolio  (cost=0.00..21204.54 rows=1060227 width=36) (actual time=242.341..255.732 rows=4045 loops=1)
  CTE complaint_category
    ->  Limit  (cost=67995.14..67995.14 rows=1 width=40) (actual time=474.968..474.970 rows=1 loops=1)
          ->  Sort  (cost=67995.14..67995.53 rows=155 width=40) (actual time=474.968..474.968 rows=1 loops=1)
                Sort Key: (count(*)) DESC
                Sort Method: top-N heapsort  Memory: 25kB
                ->  GroupAggregate  (cost=67990.88..67994.36 rows=155 width=40) (actual time=474.953..474.956 rows=4 loops=1)
                      Group Key: (CASE WHEN ((p_1.majorcategory = 'UNSANITARY CONDITION'::text) OR (p_1.majorcategory = 'GENERAL'::text)) THEN p_1.minorcategory ELSE p_1.majorcategory END)
                      ->  Sort  (cost=67990.88..67991.26 rows=155 width=32) (actual time=474.939..474.940 rows=5 loops=1)
                            Sort Key: (CASE WHEN ((p_1.majorcategory = 'UNSANITARY CONDITION'::text) OR (p_1.majorcategory = 'GENERAL'::text)) THEN p_1.minorcategory ELSE p_1.majorcategory END)
                            Sort Method: quicksort  Memory: 25kB
                            ->  Nested Loop  (cost=0.43..67985.24 rows=155 width=32) (actual time=219.997..474.922 rows=5 loops=1)
                                  ->  Seq Scan on hpd_complaints h  (cost=0.00..66894.93 rows=89 width=4) (actual time=219.966..474.829 rows=3 loops=1)
                                        Filter: (bbl = '3012380016'::bpchar)
                                        Rows Removed by Filter: 2386231
                                  ->  Index Scan using hpd_complaint_problems_complaintid_idx on hpd_complaint_problems p_1  (cost=0.43..12.12 rows=12 width=28) (actual time=0.015..0.015 rows=2 loops=3)
                                        Index Cond: (complaintid = h.complaintid)
  CTE complaint_category_with_bbl
    ->  CTE Scan on complaint_category  (cost=0.00..0.02 rows=1 width=72) (actual time=474.972..474.975 rows=1 loops=1)
  CTE total_hpd_violations
    ->  GroupAggregate  (cost=5.92..773.55 rows=139 width=19) (actual time=0.063..0.063 rows=1 loops=1)
          Group Key: hpd_violations_3.bbl
          ->  Bitmap Heap Scan on hpd_violations hpd_violations_3  (cost=5.92..771.47 rows=139 width=11) (actual time=0.042..0.055 rows=12 loops=1)
                Recheck Cond: (bbl = '3012380016'::bpchar)
                Filter: (novissueddate > '2010-01-01'::date)
                Heap Blocks: exact=8
                ->  Bitmap Index Scan on hpd_violations_bbl_idx  (cost=0.00..5.89 rows=194 width=0) (actual time=0.032..0.032 rows=12 loops=1)
                      Index Cond: (bbl = '3012380016'::bpchar)
  ->  Nested Loop Left Join  (cost=0.32..1696.67 rows=1 width=330) (actual time=1759.447..1763.295 rows=1 loops=1)
        Join Filter: ((t.bbl)::text = mc.bbl)
        ->  Nested Loop Left Join  (cost=0.32..1696.64 rows=1 width=290) (actual time=1284.463..1288.308 rows=1 loops=1)
              Join Filter: ((t.bbl)::text = p.bbl)
              ->  Nested Loop Left Join  (cost=0.32..1689.64 rows=1 width=258) (actual time=1027.672..1031.515 rows=1 loops=1)
                    Join Filter: (t.bbl = w.bbl)
                    ->  Nested Loop Left Join  (cost=0.32..1687.59 rows=1 width=226) (actual time=1027.586..1031.427 rows=1 loops=1)
                          ->  Nested Loop Left Join  (cost=0.03..1679.27 rows=1 width=218) (actual time=1027.566..1031.405 rows=1 loops=1)
                                Join Filter: ((t.bbl)::text = mb.enteredbbl)
                                ->  Nested Loop Left Join  (cost=0.03..1679.24 rows=1 width=178) (actual time=791.523..795.359 rows=1 loops=1)
                                      Join Filter: ((t.bbl)::text = a.enteredbbl)
                                      ->  Nested Loop Left Join  (cost=0.03..1672.24 rows=1 width=122) (actual time=555.652..559.487 rows=1 loops=1)
                                            Join Filter: (t.bbl = hpdv.bbl)
                                            ->  Nested Loop Left Join  (cost=0.03..1670.74 rows=1 width=106) (actual time=555.581..559.416 rows=1 loops=1)
                                                  Join Filter: (t.bbl = hpdc.bbl)
                                                  ->  Hash Right Join  (cost=0.03..1667.88 rows=1 width=98) (actual time=4.844..8.676 rows=1 loops=1)
                                                        Hash Cond: (r.ucbbl = t.bbl)
                                                        ->  Seq Scan on rentstab_summary r  (cost=0.00..1493.61 rows=46461 width=23) (actual time=0.003..4.390 rows=46461 loops=1)
                                                        ->  Hash  (cost=0.02..0.02 rows=1 width=86) (actual time=0.032..0.033 rows=1 loops=1)
                                                              Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                              ->  CTE Scan on total_res_units t  (cost=0.00..0.02 rows=1 width=86) (actual time=0.028..0.029 rows=1 loops=1)
                                                  ->  CTE Scan on count_hpd_complaints hpdc  (cost=0.00..1.76 rows=88 width=52) (actual time=550.715..550.715 rows=1 loops=1)
                                            ->  CTE Scan on count_hpd_violations hpdv  (cost=0.00..0.92 rows=46 width=60) (actual time=0.069..0.069 rows=0 loops=1)
                                      ->  CTE Scan on count_of_assoc_bldgs a  (cost=0.00..4.00 rows=200 width=88) (actual time=235.861..235.863 rows=1 loops=1)
                                ->  CTE Scan on major_boro_of_assoc_bldgs mb  (cost=0.00..0.02 rows=1 width=72) (actual time=236.032..236.034 rows=1 loops=1)
                          ->  Index Scan using rentstab_v2_bbl_idx on rentstab_v2 r2  (cost=0.29..8.31 rows=1 width=19) (actual time=0.016..0.017 rows=1 loops=1)
                                Index Cond: (t.bbl = ucbbl)
                    ->  CTE Scan on avg_wait_time w  (cost=0.00..1.26 rows=63 width=76) (actual time=0.082..0.082 rows=1 loops=1)
              ->  CTE Scan on avg_wait_time_for_portfolio p  (cost=0.00..4.00 rows=200 width=64) (actual time=256.782..256.784 rows=1 loops=1)
        ->  CTE Scan on complaint_category_with_bbl mc  (cost=0.00..0.02 rows=1 width=72) (actual time=474.973..474.976 rows=1 loops=1)
  ->  CTE Scan on total_hpd_violations thv  (cost=0.00..2.78 rows=139 width=52) (actual time=0.064..0.064 rows=1 loops=1)
Planning time: 3.143 ms
Execution time: 1764.132 ms
samaratrilling commented 3 years ago

The biggest candidates I can see for refactoring are 1; complaint category - maybe we can switch the order by desc limit 1 to a max function.

  1. complaint_category_with_bbl - maybe make sure it's grouped by bbl or there's an index on bbl
  2. obvi we're doing a lot of left joins at the end - maybe we should make a DDO table with an index on bbl that just has all this info?
toolness commented 3 years ago

Nice detective work, thanks @samaratrilling! Here's some thoughts:

  1. As for adding indexes, it depends on the table. I think all tables except for wow_bldgs are created by NYCDB, so adding indexes on them needs to be done over on that project--see https://github.com/nycdb/nycdb/pull/101 for an example.

  2. Making a DDO table isn't a bad idea, though there's the question of where the SQL code for that table should go. The easy thing to do would be to put it over in the WoW codebase, but that feels very weird if WoW never actually uses that table for anything itself. On the other hand, adding the creation of the DDO table to tenants2's codebase would mean importing it from our nycdb-k8s-loader project and running its code on a regular basis, just like we currently run WoW's table-creating code, which might be a bit of a hassle.

samaratrilling commented 3 years ago

Added an index on bbl on the hpd_complaints table over on the nycdb repo.

samaratrilling commented 3 years ago

To do: add the new stats on here before closing the issue.