drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

0.7.55 test results #467

mmd-osm closed this issue 2 years ago

mmd-osm commented 6 years ago

showstopper

Area creation still has really serious performance issues due to eval_set

This is a must fix before releasing 0.7.55 as area creation is pretty much unusable right now.

https://github.com/drolbr/Overpass-API/issues/414

retro

Invisible versions are returned in result

see https://github.com/drolbr/Overpass-API/issues/282#issuecomment-366381776

for

The term "for" is a bit difficult to understand; it is something like a "group by" or "split by" based on an evaluator expression.

Steps:

  1. Based on the evaluator statement, determine a "bucket" for each entry in the input set
  2. Store the entry in the previously determined bucket
  3. Iterate over each bucket, processing all objects in the respective bucket. The current bucket's runtime value is accessible via a special input set property val (see Set Key-Value Evaluator)
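
The three steps above can be sketched outside Overpass as a plain group-by loop. This is only an illustration in Python: the element dictionaries, the precomputed lengths and the `for_each_bucket` helper are hypothetical, and placing untagged elements in an empty-string bucket is an assumption based on t["..."] evaluating to an empty string for missing tags.

```python
from collections import defaultdict

def for_each_bucket(input_set, evaluator):
    """Group elements by their evaluator value, then yield (val, bucket) pairs."""
    buckets = defaultdict(list)
    for element in input_set:
        # Steps 1 and 2: determine the bucket and store the element in it.
        buckets[evaluator(element)].append(element)
    # Step 3: visit each bucket; the sorted order is a choice of this sketch.
    for val in sorted(buckets):
        yield val, buckets[val]

# Hypothetical ways with a surface tag and a precomputed length in metres.
ways = [
    {"id": 1, "tags": {"surface": "asphalt"}, "length": 120.0},
    {"id": 2, "tags": {"surface": "gravel"}, "length": 80.0},
    {"id": 3, "tags": {"surface": "asphalt"}, "length": 30.5},
    {"id": 4, "tags": {}, "length": 10.0},  # no surface tag: "" bucket (assumption)
]

# Rough analogue of: make stat num=count(ways),key=_.val,len=sum(length());
stats = [
    {"num": len(bucket), "key": val, "len": sum(w["length"] for w in bucket)}
    for val, bucket in for_each_bucket(ways, lambda w: w["tags"].get("surface", ""))
]
```

Each entry in `stats` corresponds to one derived "stat" element that the loop body would emit per bucket.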

Test example: count number and total length of all highways in bbox per surface

[out:csv(num,key,len)];
way[highway]({{bbox}});
for (t["surface"])
(
  make stat num=count(ways),key=_.val,len=sum(length());
  out;
);

List of unique street names in an area, with total count of ways and length:

[timeout:300]
[out:csv(num,key,len)];
area[name="Saarland"];
way[highway=residential][name](area);
for (t["name"])
(
  make stat num=count(ways),key=_.val,len=sum(length());
  out;
);

Stats per user in bbox (last editor)

[out:csv(key, total, nodes, ways, relations, len)];
nwr({{bbox}});
for (user())
(
  make stat key = _.val,
            total =  count(nodes) + count(ways) + count(relations),
            nodes = count(nodes),
            ways  = count(ways),
            relations = count(relations),
            len=sum(length());
  out;
);

Further examples as per https://forum.openstreetmap.org/viewtopic.php?pid=692686#p692686

Recent change: out has to be placed inside the for { } block!

nwr

User id filter is ignored

[bbox:{{bbox}}];
nwr(uid:212553);
out 100 geom meta;
node(uid:212553);
out 100 geom meta;

-> nwr(uid:212553) currently has no effect

Test with around

area[name="Bonn"];
node(area)[highway=bus_stop];
nwr(around:100)[amenity=cinema];
out center;

Looks good.

Element dependent evaluators

Global admin boundary relation check for duplicate members

rel[boundary=administrative](if:count_members() != count_distinct_members());
out;
convert rel ::id = id(),
           ::   = ::,
           _count_members          = count_members(),
           _count_distinct_members = count_distinct_members(),
           _count_by_role          = count_by_role("outer"),
           _count_distinct_by_role = count_distinct_by_role("outer") ;

out geom;

One example result out of 6 results: https://www.openstreetmap.org/relation/6259670

Meta data evaluators

version

way[railway=rail]({{bbox}})(if:version()>30);
out geom;
make info versions=set(version());
out;

type

Calculate new object id based on original object type

[bbox:{{bbox}}];

nwr[amenity=recycling](if: t["recycling:glass"] || 
                            t["recycling:paper"] ||
                            t["recycling:clothes"]);

convert nwr
        ::id = id() * (type() == "node" ? 10000000 : 1),
        ::geom = center(geom()),
        :: = ::;
out geom;

Mixed cases

nwr, center, geom, string concatenation, ternary if

[bbox:{{bbox}}];

nwr[amenity=recycling](if: t["recycling:glass"] || 
                           t["recycling:paper"] ||
                           t["recycling:clothes"]);

convert nwr
        ::geom = center(geom()),
        :: = ::,
        url =  ("http://www.tappenbeck.net/osm/maps/icons_recy/") + 
               (t["recycling:glass"] ? "g" : "") +
               (t["recycling:paper"] ? "p" : "") +
               (t["recycling:clothes"] ? "k" : "") +
               (".png")
               ;
out geom;

Named street intersections

Determines street intersections and creates dummy nodes with a _names tag containing all distinct street names at a given intersection.

[timeout:600]
[bbox:{{bbox}}]
;
//area[name="Duisburg"][boundary=administrative][admin_level=6];
way[name]["highway"]["highway"!~"footway|cycleway|path|service|track|steps"];

foreach ->.w {
  node(w.w);

  ( way(bn)
       [name]
       (if:!lrs_in(t["name"],w.set(t["name"])))
       ["highway"]
       ["highway"!~"footway|cycleway|path|service|track|steps"];

     - .w;)->.wd;

  node(w.wd)(w.w);

  foreach{

     way(bn)[name] -> .ways;

     out;      // dummy node for turbo

     convert node  ::id    = id(),
                   ::geom  = geom(),
                   ::      = ::,
                   _names  = ways.set(t["name"]);
     out geom;
  };
};

User rank in area sorted by number of last changed nodes

[out:csv(user, nodes)];
area[name="Saarland"]->.a;
(node(area.a););
for(user()){
   make res ::id  = 100000000 - count(nodes),
            user = _.val,
            nodes = count(nodes);
   (._;.res;)->.res;  
}
.res out;
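
The query above appears to get its ranking from the derived element ids: assuming output is ordered by ascending id, setting the id to 100000000 minus the node count puts the most active user first. A minimal Python sketch of that idea, with hypothetical input data and a hypothetical `rank_users` helper:

```python
ID_OFFSET = 100000000  # same constant as in the query above

def rank_users(nodes):
    """Count nodes per last editor and order users by descending count."""
    counts = {}
    for node in nodes:
        counts[node["user"]] = counts.get(node["user"], 0) + 1
    derived = [
        {"id": ID_OFFSET - n, "user": user, "nodes": n}
        for user, n in counts.items()
    ]
    # Assumption: elements are written in ascending id order, so a smaller id
    # (a larger node count) comes first. Users with equal counts would share
    # an id; how Overpass treats that case is not covered by this sketch.
    return sorted(derived, key=lambda d: d["id"])

# Hypothetical nodes, each carrying only its last editor.
sample_nodes = [{"user": u} for u in ["a", "b", "b", "c", "b", "c"]]
ranking = rank_users(sample_nodes)
```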

lrs

Min / Max value

Example

[bbox:{{bbox}}];

way[maxheight][maxheight~";"];
convert demo ::id = id(),
              ::geom = geom(),
              :: = ::,
              _max = lrs_max(t["maxheight"]),
              _min = lrs_min(t["maxheight"]);
out geom;

Result:

  <demo id="11662193">
    <vertex lat="51.4813926" lon="-0.0450193"/>
    <vertex lat="51.4814612" lon="-0.0445635"/>
    <tag k="highway" v="unclassified"/>
    <tag k="maxheight" v="2.9m; 9'6&quot;"/>
    <tag k="maxspeed" v="20 mph"/>
    <tag k="name" v="Cold Blow Lane"/>
    <tag k="tunnel" v="yes"/>
    <tag k="_max" v="9'6&quot;"/>
    <tag k="_min" v="2.9m"/>
  </demo>
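
The result above can be reproduced with a small Python model of the lrs ("list represented set") helpers. This is a sketch under assumptions, not the Overpass implementation: entries are taken to be separated by semicolons and trimmed, and comparison is assumed to be numeric only when every entry parses as a number, lexicographic otherwise.

```python
def _parse_lrs(value):
    """Split a semicolon-separated list and trim whitespace around entries."""
    return [part.strip() for part in value.split(";") if part.strip()]

def _sort_key(entries):
    """Return float if all entries are numeric, else str (assumed rule)."""
    try:
        for entry in entries:
            float(entry)
    except ValueError:
        return str
    return float

def lrs_in(needle, value):
    """True if needle is one of the entries of the list."""
    return needle in _parse_lrs(value)

def lrs_max(value):
    entries = _parse_lrs(value)
    return max(entries, key=_sort_key(entries)) if entries else ""

def lrs_min(value):
    entries = _parse_lrs(value)
    return min(entries, key=_sort_key(entries)) if entries else ""
```

With the maxheight value from the example, 2.9m; 9'6", neither entry is a plain number, so the comparison falls back to string order, which matches the _max and _min tags shown above.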

Multiple values containing specific value

way[maxheight][maxheight~";"](if:lrs_in("3.9", t["maxheight"]));
out geom;

compare

Find out if a user was / is currently involved in mapping highway ways

[diff:"2018-01-01T00:00:00Z"];

way(30292271);
out meta;

compare(delta: (uid() == 344561))  // user() == "FahRadler"
  (
  out meta;
  make bla x = count(ways);
  out meta ;

  );

The result is not clear: the object was deleted in the meantime, but user uid 344561 ("FahRadler") was never involved in this object. Why does it show up in "show_initial"?

<action type="delete">
<old>
  <way id="30292271" version="6" timestamp="2014-01-12T15:50:45Z" changeset="19954616" uid="1418319" user="Wolmatinger">
    <nd ref="319518338"/>
    <nd ref="2570227717"/>
    <nd ref="333949847"/>
    <tag k="highway" v="residential"/>
    <tag k="maxspeed" v="30"/>
    <tag k="name" v="Zur Hartwiese"/>
    <tag k="surface" v="paved"/>
    <tag k="width" v="4"/>
  </way>
</old>
</action>
<action type="show_initial">
  <way id="30292271" version="6" timestamp="2014-01-12T15:50:45Z" changeset="19954616" uid="1418319" user="Wolmatinger">
    <nd ref="319518338"/>
    <nd ref="2570227717"/>
    <nd ref="333949847"/>
    <tag k="highway" v="residential"/>
    <tag k="maxspeed" v="30"/>
    <tag k="name" v="Zur Hartwiese"/>
    <tag k="surface" v="paved"/>
    <tag k="width" v="4"/>
  </way>
</action>
<action type="show_initial">
  <bla id="1">
    <tag k="x" v="1"/>
  </bla>
</action>
<action type="show_final">
  <bla id="2">
    <tag k="x" v="0"/>
  </bla>
</action>

</osm>

local

connections between links for ways not clear

[bbox:{{bbox}}];
way[highway];
local;
out geom;

Result:

  <link id="1">
    <vertex lat="49.4438425" lon="9.2878601"/>
    <vertex lat="49.4437862" lon="9.2878917"/>
    <vertex lat="49.4436511" lon="9.2880148"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Schloßstraße"/>
  </link>
  <link id="2">
    <vertex lat="49.4436511" lon="9.2880148"/>
    <vertex lat="49.4435918" lon="9.2878582"/>
    <vertex lat="49.4434027" lon="9.2879309"/>
    <vertex lat="49.4432280" lon="9.2880906"/>
    <vertex lat="49.4431659" lon="9.2881479"/>
    <vertex lat="49.4431795" lon="9.2883409"/>
    <vertex lat="49.4432842" lon="9.2886486"/>
    <vertex lat="49.4433175" lon="9.2887790"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Schloßstraße"/>
  </link>

Issue: although links 1 and 2 belong to the same way, it is not obvious from the data that they are connected. Sharing the same lat and lon values is not sufficient in this case.

This is much clearer for relations:

rel[boundary=administrative];
local;
out geom;

Result:

  <trigraph id="1">
    <vertex lat="49.4992894" lon="9.4290001"/>
    <tag k="TMC:cid_58:tabcd_1:Class" v="Area"/>
    <tag k="_next" v="2"/>
  </trigraph>
  <trigraph id="2">
    <vertex lat="49.4934251" lon="9.4300524"/>
    <vertex lat="49.4932773" lon="9.4299863"/>
    <tag k="_previous" v="1"/>
    <tag k="admin_centre:postal_code" v="74722"/>
    <tag k="admin_level" v="8"/>
    <tag k="wikipedia" v="de:Buchen (Odenwald)"/>
    <tag k="_next" v="3"/>
  </trigraph>

mmd-osm commented 6 years ago

Derived objects

geometry & around

// does not work (no result)
make node ::geom = pt(49,7);
node(around:100);
out geom;

// works ok
//node(around:100,49,7);
//out geom;
make node ::geom = lstr(pt(49.30562251217971,6.947393417358398), pt(49.279312566741346,6.907224655151367));
node(around:100);
out geom;

--> Unfortunately this is not supported; it would be a good fit for #418.

Added a prototype here: http://overpass-turbo.eu/s/xjy // https://github.com/mmd-osm/Overpass-API/commit/dd9d866c39fe5d79d9cc8b165db6d227add396d6

filter on tags

(
  make demo key = "abcdef";
  make demo key = "blubb";
  make demo key = "hello";
);

out geom;

derived._[key="abcdef"];
out geom;

derived._[key~"de"];
out geom;

-> ok

mmd-osm commented 6 years ago

if( ...)

"Verklebte Landuses"

https://lists.openstreetmap.org/pipermail/talk-de/2018-April/114786.html

Highway and landuse ways share at least 2 common nodes.

[bbox:{{bbox}}];

way[highway];
node(w) -> .highway_nodes;
way(bn.highway_nodes)[landuse];
foreach -> .luway {
  node.highway_nodes(w.luway)->.lunodes;
  way(bn.lunodes)[highway];
  foreach -> .highway{
     node.lunodes(w.highway);
     if(count(nodes) >= 2) {
       way.highway(bn)[highway];
       out geom;
     }
  }
}

Overpass turbo link: http://overpass-turbo.eu/s/xHp
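
The condition the query checks (a highway and a landuse way sharing at least two nodes) can be illustrated with plain set intersection. The sketch below is a Python illustration only; the way ids, node ids and the `glued_pairs` helper are hypothetical.

```python
def glued_pairs(highways, landuses, min_shared=2):
    """Return (highway_id, landuse_id, shared_nodes) for ways sharing enough nodes."""
    pairs = []
    for hw_id, hw_nodes in highways.items():
        for lu_id, lu_nodes in landuses.items():
            shared = set(hw_nodes) & set(lu_nodes)
            # Mirrors if(count(nodes) >= 2) in the query above.
            if len(shared) >= min_shared:
                pairs.append((hw_id, lu_id, sorted(shared)))
    return pairs

# Hypothetical ways given as lists of node ids.
highways = {100: [1, 2, 3, 4], 101: [4, 5, 6]}
landuses = {200: [2, 3, 9, 10, 2], 201: [6, 11, 12, 6]}

result = glued_pairs(highways, landuses)
```

Here only highway 100 and landuse 200 are reported; highway 101 touches landuse 201 at a single node, which does not meet the threshold.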

drolbr commented 6 years ago

ad retro: This is fallout from #463. In particular, a request for the date in question would return way version 2 as well. This is damaged data in the database; the underlying code has been fixed.

drolbr commented 6 years ago

ad for: I do understand that the name can be confusing. Yet I do not know any better name, and I am not convinced that group_by or split_by give the right mindset. It really is a loop, and an unconditional one. I could agree on something like "forgroup", "gfor", "forexp" or so on, maybe "groupwise". "foreach" comes to mind but is already in use. Please tell me if you have further ideas.

drolbr commented 6 years ago

ad nwr: This has been a user/uid specific bug and has been fixed in ec732ced276c955bc7f621c8294f8ca3d325335b

drolbr commented 6 years ago

ad Global admin boundary relation check for duplicate members: In the meantime, this finds more relations. All of them do have duplicate members, hence it works as expected. Thank you for the query and the example.

drolbr commented 6 years ago

ad Named street intersections: This is a really cool query. In the test area, it works as expected.

drolbr commented 6 years ago

ad compare:

[diff:"2018-01-01T00:00:00Z"];

way(30292271);
out meta;

compare(delta: 0)
{
  out meta;
  make bla x = count(ways);
  out meta ;
}

would have precisely the same effect, without any relation to user "FahrRadler". The expression delta is expected to depend on an object. Thus, its value is undefined for a non-existing object. For the moment, I deemed that the empty string as result for the entire expression would be the best fit, working well for tag comparison (the expected most frequent case).

The difference is visible if one uses the empty string as delta expression:

[diff:"2018-01-01T00:00:00Z"];

way(30292271);
out meta;

compare(delta: "")
{
  out meta;
  make bla x = count(ways);
  out meta ;
}

drolbr commented 6 years ago

ad local: I will not guarantee this feature as stable, thus it is deferred until after the release. For the uniqueness: almost all vertices are on unique coordinates. If there are indeed two vertices on the same coordinate, then they are flagged with a special local id to distinguish them.

drolbr commented 6 years ago

ad "Verklebte Landuses": This is also a very cool idea. In the long term there should be a dedicated operator, probably along with a tool to get the first/last/whatever specific node of a way.

drolbr commented 6 years ago

ad geometry & around: This would definitely be a useful feature. But I now see this as something that should be added in addition to #418: while this is dynamic, #418 is static, hence it can be optimized more rigidly beforehand than a dynamic structure.

mmd-osm commented 6 years ago

ad nwr: This has been a user/uid specific bug and has been fixed in ec732ce

I'm seeing some issues with this change for the following query:

node(user:chris66);
out count;

This used to run in <20s (with #435 in place), and now takes around 42s (User_Statement::execute is no longer called). As it tries to read all nodes into memory, the memory footprint has also increased to 2.7 GB.

I think it would be better to have logic in User_Statement::execute(Resource_Manager& rman) similar to that in query.cc, with bit pattern matching on the type: if (type & QUERY_WAY) in addition to the current string result_type. Or even simpler:

Replace: if ((result_type == "") || (result_type == "node"))

by if ((result_type == "") || (result_type == "node") || (result_type == "nwr"))

and similar for ways and relations.

Once that's in place, we can switch back to standalone mode for user and uid.

Currently, result_type contains nwr for the query example above, i.e. all of the node / way / relation processing simply gets bypassed in User_Statement::execute. That's the reason why this query doesn't return any result.
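
The difference between the two checks can be shown with a small model. This is a Python sketch, not the C++ code from the repository: the `QueryType` flag values, the `TYPE_MASKS` table and both helper functions are hypothetical and only illustrate why a string comparison misses "nwr" while a bit pattern does not.

```python
from enum import IntFlag

class QueryType(IntFlag):
    # Flag values are assumptions of this sketch, not taken from query.cc.
    NODE = 1
    WAY = 2
    RELATION = 4

ALL_TYPES = QueryType.NODE | QueryType.WAY | QueryType.RELATION

# Hypothetical mapping from the result_type string to a bit pattern.
TYPE_MASKS = {
    "": ALL_TYPES,
    "node": QueryType.NODE,
    "way": QueryType.WAY,
    "relation": QueryType.RELATION,
    "nwr": ALL_TYPES,
}

def processes_nodes_by_string(result_type):
    """Mirrors the current check: only "" and "node" trigger node processing."""
    return result_type == "" or result_type == "node"

def processes_by_mask(result_type, wanted):
    """Bit pattern check in the spirit of if (type & QUERY_WAY)."""
    return bool(TYPE_MASKS[result_type] & wanted)
```

With result_type set to "nwr", the string check skips node processing, whereas the mask check covers nodes, ways and relations without listing "nwr" in every branch.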

mmd-osm commented 6 years ago

Stricter syntax checks for union/difference have caused some issues. In all cases, missing semicolons were the culprit. This was correctly documented on the Overpass QL page, but hasn't really been enforced in the past.

Also t[name] is no longer accepted and needs to be written as t["name"].

DaveF63 commented 6 years ago

The addition of nwr is very beneficial. Thanks for adding it.

Will it be implemented as an option for Statistical Counts instead of: total = count(nodes) + count(ways) + count(relations), or is there a more concise way that I'm unaware of?