drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

foreach: ways accumulating #568

Closed mmd-osm closed 1 year ago

mmd-osm commented 4 years ago

I've noticed that the following query gets slower the longer it runs, apparently because ways accumulate across foreach iterations. This is most visible in Way_Geometry_Store, where an increasing number of ways is prepared on each iteration.

When running the query for a single FIPS value (e.g. area["nist:fips_code"="16011"];), only a single index and two ways need to be prepared in Way_Geometry_Store, supporting the idea that foreach doesn't clean up some data here.

I believe this is caused by a missing clear() in Area_Constraint::get_ranges; see https://github.com/mmd-osm/Overpass-API/commit/debc35743ef549de70fe6e19eee11e370f68cd77 for a fix. With that fix, the query runtime drops from >2h (the query below times out) to 50s.
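To illustrate the suspected failure mode in isolation, here is a minimal sketch. It is not the actual Overpass API code: `Index`, `collect_ranges_accumulating`, `collect_ranges_resetting` and `sizes_per_iteration` are hypothetical names, and a `std::set` stands in for whatever container `get_ranges` really fills. It only shows how a result container that is reused across loop iterations keeps growing when the function filling it never calls clear().

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for an index type; not the real Uint31_Index.
using Index = unsigned int;

// Without clear(): entries from earlier calls stay in `ranges`.
void collect_ranges_accumulating(const std::vector<Index>& area_blocks,
                                 std::set<Index>& ranges)
{
  ranges.insert(area_blocks.begin(), area_blocks.end());
}

// With clear(): `ranges` reflects only the current area.
void collect_ranges_resetting(const std::vector<Index>& area_blocks,
                              std::set<Index>& ranges)
{
  ranges.clear();
  ranges.insert(area_blocks.begin(), area_blocks.end());
}

// Simulates a foreach loop that reuses one container for every area and
// records how many entries the container holds after each iteration.
template <typename Collect>
std::vector<std::size_t> sizes_per_iteration(
    const std::vector<std::vector<Index>>& areas, Collect collect)
{
  std::set<Index> ranges;  // long-lived state shared by all iterations
  std::vector<std::size_t> sizes;
  for (const auto& area : areas)
  {
    collect(area, ranges);
    sizes.push_back(ranges.size());
  }
  return sizes;
}
```

For three areas covering the index sets {1, 2}, {3} and {4, 5, 6}, the accumulating variant reports sizes 2, 3, 6, while the resetting variant reports 2, 1, 3. The first pattern resembles the steadily growing counts in the log output of this issue.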

Testing based on commit 021dfdb822d4b513de8fec684187daa038719840

Query

[timeout:7200]
[out:csv("nist:fips_code", fips, name, total)];

// All counties with FIPS 6-4 county codes
area["nist:fips_code"~"^[0-9]{5}$"];

// Count the fuel stations in each county
foreach->.county(
  .county out tags;
  // Collect all matching features in the current county
  way["amenity"="fuel"](area.county);

  // Sum nodes, ways, and relations together and group by county
  make count fips = county.set(t["nist:fips_code"]),
             name = county.set(t["name"]),
             total = count(nodes) + count(ways) + count(relations);
//  out;
)

Logging

A small piece of instrumentation shows the number of indices and ways on each call to the Way_Geometry_Store constructor:

diff --git a/src/overpass_api/data/way_geometry_store.cc b/src/overpass_api/data/way_geometry_store.cc
index 75d6de37..b8f1a499 100644
--- a/src/overpass_api/data/way_geometry_store.cc
+++ b/src/overpass_api/data/way_geometry_store.cc
@@ -92,6 +92,9 @@ std::map< Uint32_Index, std::vector< Node_Skeleton > > small_way_members
 Way_Geometry_Store::Way_Geometry_Store
     (const std::map< Uint31_Index, std::vector< Way_Skeleton > >& ways, const Statement& query, Resource_Manager& rman)
 {
+long total = 0;
+for (const auto &r: ways) total+=r.second.size();
+std::cout << "Constructor called with #ways: " << total << " in " << ways.size() <<" indices\n";
   // Retrieve all nodes referred by the ways.
   std::map< Uint32_Index, std::vector< Node_Skeleton > > way_members_ = small_way_members(&query, rman, ways);

Output

The number of ways and indices keeps increasing from one foreach iteration to the next:

nist:fips_code  fips    name    total
02016       Aleutians West Census Area  
Constructor called with #ways: 2 in 2 indices
Constructor called with #ways: 0 in 0 indices
60010       Eastern District    
Constructor called with #ways: 4 in 3 indices
Constructor called with #ways: 0 in 0 indices
60050       Western District    
Constructor called with #ways: 4 in 3 indices
Constructor called with #ways: 0 in 0 indices
60020       Manu'a District 
Constructor called with #ways: 4 in 3 indices
Constructor called with #ways: 0 in 0 indices
60030       Rose Atoll  
Constructor called with #ways: 4 in 3 indices
Constructor called with #ways: 0 in 0 indices
02185       North Slope 
Constructor called with #ways: 6 in 5 indices
Constructor called with #ways: 0 in 0 indices
15003       Honolulu County 
Constructor called with #ways: 51 in 41 indices
Constructor called with #ways: 0 in 0 indices
02013       Aleutians East  
Constructor called with #ways: 56 in 44 indices
Constructor called with #ways: 0 in 0 indices
02050       Bethel  
[...]
Constructor called with #ways: 12689 in 10100 indices
Constructor called with #ways: 0 in 0 indices
47105       Loudon County   
Constructor called with #ways: 12689 in 10100 indices
Constructor called with #ways: 0 in 0 indices
47123       Monroe County   
Constructor called with #ways: 12689 in 10100 indices
Constructor called with #ways: 0 in 0 indices
37039       Cherokee County 
Constructor called with #ways: 12692 in 10103 indices
Constructor called with #ways: 0 in 0 indices
37043       Clay County 
Constructor called with #ways: 12692 in 10103 indices
Constructor called with #ways: 0 in 0 indices
37075       Graham County   
Constructor called with #ways: 12692 in 10103 indices
Constructor called with #ways: 0 in 0 indices
37173       Swain County    
Constructor called with #ways: 12693 in 10104 indices
Constructor called with #ways: 0 in 0 indices
47133       Overton County  
Constructor called with #ways: 12693 in 10104 indices
Constructor called with #ways: 0
[...]
mmd-osm commented 1 year ago

The issue disappeared during the large-scale refactoring in 0.7.58.