mysociety / mapit

A web service to map postcodes to administrative boundaries and more
Other
269 stars 88 forks source link

Store subdivided areas #406

Closed dracos closed 10 months ago

dracos commented 1 year ago

This creates a table containing the same areas as the Geometry table, but split using ST_Subdivide. This is updated on save, and is used for postcode/point lookups, as well as geometry intersection. This makes things quicker:

mapitcopy=# explain analyze
SELECT *
FROM "mapit_area"
    INNER JOIN "mapit_geometry" ON ("mapit_area"."id" = "mapit_geometry"."area_id")
    INNER JOIN "mapit_geometrysubdivided" ON ("mapit_geometry"."id" = "mapit_geometrysubdivided"."geometry_id")
WHERE ("mapit_area"."generation_high_id" >= 50
        AND "mapit_area"."generation_low_id" <= 50
        AND ST_Contains("mapit_geometrysubdivided"."division", ST_Transform((select location from mapit_postcode where postcode='EH11BB'), 27700)));
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1081.19..48659.65 rows=73 width=16290) (actual time=0.738..27.903 rows=9 loops=1)
   Workers Planned: 1
   Params Evaluated: $0
   Workers Launched: 1
   InitPlan 1 (returns $0)
     ->  Index Scan using postcodes_postcode_postcode_key on mapit_postcode  (cost=0.43..8.45 rows=1 width=32) (actual time=0.029..0.030 rows=1 loops=1)
           Index Cond: ((postcode)::text = 'EH11BB'::text)
   ->  Nested Loop  (cost=72.75..47643.91 rows=43 width=16290) (actual time=0.157..0.265 rows=4 loops=2)
         ->  Nested Loop  (cost=72.33..47604.39 rows=83 width=16250) (actual time=0.113..0.212 rows=7 loops=2)
               ->  Parallel Bitmap Heap Scan on mapit_geometrysubdivided  (cost=71.91..46941.75 rows=83 width=2058) (actual time=0.102..0.145 rows=7 loops=2)
                     Filter: st_contains(division, st_transform($0, 27700))
                     Rows Removed by Filter: 2
                     Heap Blocks: exact=17
                     ->  Bitmap Index Scan on mapit_geometrysubdivided_division_id  (cost=0.00..71.87 rows=1411 width=0) (actual time=0.150..0.150 rows=17 loops=1)
                           Index Cond: (division ~ st_transform($0, 27700))
               ->  Index Scan using areas_geometry_pkey on mapit_geometry  (cost=0.42..7.98 rows=1 width=14192) (actual time=0.009..0.009 rows=1 loops=14)
                     Index Cond: (id = mapit_geometrysubdivided.geometry_id)
         ->  Index Scan using areas_area_pkey on mapit_area  (cost=0.42..0.48 rows=1 width=40) (actual time=0.007..0.007 rows=1 loops=14)
               Index Cond: (id = mapit_geometry.area_id)
               Filter: ((generation_high_id >= 50) AND (generation_low_id <= 50))
               Rows Removed by Filter: 0
 Planning Time: 1.212 ms
 Execution Time: 28.358 ms
(23 rows)

mapitcopy=# explain analyze
SELECT *
FROM "mapit_area"
    INNER JOIN "mapit_geometry" ON ("mapit_area"."id" = "mapit_geometry"."area_id")
WHERE ("mapit_area"."generation_high_id" >= 50
        AND "mapit_area"."generation_low_id" <= 50
        AND ST_Contains("mapit_geometry"."polygon", ST_Transform((select location from mapit_postcode where postcode='EH11BB'), 27700)));
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=34.15..10465.46 rows=10 width=14232) (actual time=0.997..66.033 rows=9 loops=1)
   InitPlan 1 (returns $0)
     ->  Index Scan using postcodes_postcode_postcode_key on mapit_postcode  (cost=0.43..8.45 rows=1 width=32) (actual time=0.031..0.032 rows=1 loops=1)
           Index Cond: ((postcode)::text = 'EH11BB'::text)
   ->  Index Scan using areas_geometry_polygon_id on mapit_geometry  (cost=25.28..10296.61 rows=19 width=14192) (actual time=0.207..65.916 rows=14 loops=1)
         Index Cond: (polygon ~ st_transform($0, 27700))
         Filter: st_contains(polygon, st_transform($0, 27700))
         Rows Removed by Filter: 12
   ->  Index Scan using areas_area_pkey on mapit_area  (cost=0.42..8.44 rows=1 width=40) (actual time=0.007..0.007 rows=1 loops=14)
         Index Cond: (id = mapit_geometry.area_id)
         Filter: ((generation_high_id >= 50) AND (generation_low_id <= 50))
         Rows Removed by Filter: 0
 Planning Time: 0.569 ms
 Execution Time: 66.155 ms
(14 rows)
Below merged: Also I spotted an st_collect that doesn't seem necessary and actually makes things slower: ``` mapit=# explain analyze WITH target AS MATERIALIZED ( SELECT ST_Transform( (select st_collect(polygon) from mapit_geometry where area_id=2643 group by area_id), 4326) AS division ) SELECT "mapit_postcode"."id", "mapit_postcode"."postcode", "mapit_postcode"."location"::bytea FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.division) LIMIT 1 ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=2318.48..2346.47 rows=1 width=43) (actual time=1478.528..1478.534 rows=1 loops=1) CTE target -> Result (cost=2217.96..2242.97 rows=1 width=32) (actual time=503.701..503.704 rows=1 loops=1) InitPlan 1 (returns $0) -> GroupAggregate (cost=0.42..2217.96 rows=466 width=36) (actual time=23.618..23.621 rows=1 loops=1) Group Key: mapit_geometry.area_id -> Index Scan using mapit_geometry_area_id on mapit_geometry (cost=0.42..1571.26 rows=467 width=14030) (actual time=0.024..0.983 rows=508 loops=1) Index Cond: (area_id = 2643) -> Nested Loop (cost=75.51..56226.67 rows=2006 width=43) (actual time=1478.526..1478.528 rows=1 loops=1) -> CTE Scan on target (cost=0.00..0.02 rows=1 width=32) (actual time=512.482..512.482 rows=1 loops=1) -> Bitmap Heap Scan on mapit_postcode (cost=75.51..55973.89 rows=201 width=43) (actual time=966.017..966.018 rows=1 loops=1) Filter: st_coveredby(location, target.division) Rows Removed by Filter: 309 Heap Blocks: exact=42 -> Bitmap Index Scan on postcodes_postcode_location_id (cost=0.00..75.46 rows=2006 width=0) (actual time=1.965..1.965 rows=14915 loops=1) Index Cond: (location @ target.division) Planning Time: 0.272 ms Execution Time: 1482.774 ms (18 rows) mapit=# explain analyze WITH target AS MATERIALIZED ( SELECT ST_Transform(mapit_geometry.polygon, 4326) AS division FROM mapit_geometry WHERE area_id = 2643 ) SELECT "mapit_postcode"."id", "mapit_postcode"."postcode", "mapit_postcode"."location"::bytea FROM mapit_postcode, target WHERE ST_CoveredBy(location, target.division) LIMIT 1 ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=13246.67..13271.92 rows=1 width=43) (actual time=20.491..20.493 rows=1 loops=1) CTE target -> Index Scan using mapit_geometry_area_id on mapit_geometry (cost=0.42..13246.26 rows=467 width=32) (actual time=15.025..19.839 rows=21 loops=1) Index Cond: (area_id = 2643) -> Nested Loop (cost=0.41..23647896.06 rows=936571 width=43) (actual time=20.489..20.489 rows=1 loops=1) -> CTE Scan on target (cost=0.00..9.34 rows=467 width=32) (actual time=15.030..19.878 rows=21 loops=1) -> Index Scan using postcodes_postcode_location_id on mapit_postcode (cost=0.41..50385.17 rows=201 width=43) (actual time=0.028..0.028 rows=0 loops=21) Index Cond: (location @ target.division) Filter: st_coveredby(location, target.division) Planning Time: 0.265 ms Execution Time: 22.019 ms (11 rows) ```
codecov[bot] commented 1 year ago

Codecov Report

Merging #406 (13486c3) into master (f24a55c) will increase coverage by 0.17%. Report is 1 commits behind head on master. The diff coverage is 93.10%.

:exclamation: Current head 13486c3 differs from pull request most recent head 1536fe9. Consider uploading reports for the commit 1536fe9 to get more accurate results

@@            Coverage Diff             @@
##           master     #406      +/-   ##
==========================================
+ Coverage   74.66%   74.83%   +0.17%     
==========================================
  Files          45       47       +2     
  Lines        2759     2782      +23     
==========================================
+ Hits         2060     2082      +22     
- Misses        699      700       +1     
Files Coverage Δ
mapit/migrations/0005_geometrysubdivided.py 100.00% <100.00%> (ø)
mapit/migrations/0006_auto_20230523_0940.py 100.00% <100.00%> (ø)
mapit/models.py 86.02% <93.33%> (+0.31%) :arrow_up:
mapit/views/areas.py 72.53% <66.66%> (ø)