systemed / tilemaker

Make OpenStreetMap vector tiles without the stack
https://tilemaker.org/
Other
1.42k stars 228 forks source link

Nested relations via post-scan phase #638

Closed systemed closed 7 months ago

systemed commented 7 months ago

With #632 we have support for nodes in relations and for relation roles. This completes the missing functionality by supporting nested relations.

Nested relations are quite niche but important for a few use cases, particularly cycling and walking routes. Because each relation type is sui generis, it's difficult to provide an interface which caters for all use cases while remaining performant. This PR provides two (related) mechanisms:

Relation post-scan

The purpose of relation_postscan_function is to enable tags from parent relations to "bounce down" to their children. Consider this common case:

Previously, tilemaker's way_function could only see the relation that the ways were a direct member of, i.e. 12345. We could potentially expand it so that NextRelation within way_function iterated over parent relations too, but this would mean repeating the same processing for potentially thousands of member ways.

Instead, we provide a way to modify the child relation via a new SetTag method:

function relation_postscan_function(child)
    print("In relation_postscan_function for "..child:Id())
    print("child network is "..child:Find("network"))
    while true do
        local parent,role = child:NextRelation()
        if not parent then break end
        local parent_ref = child:FindInRelation("ref")
        print(" - in "..parent.." "..parent_ref.."/"..role)
        child:SetTag("child_of", parent_ref)
    end
end

Then, when way_function comes to examine the relations that the way is a member of, it'll be able to do things based on the tags we've set (in this case, child_of).

Deeply nested relations

The OSM data model allows multiple levels of nesting, e.g. way->relation->relation->relation.

For relation_postscan_function we flatten out all these levels. The relations_for_relation_with_parents method returns a relationList-style list of all the parent/grandparent/great-grandparent/etc. relations. The original hierarchy is not recorded. We do trap circular references (these exist!).

(We could potentially use relations_for_relation_with_parents in way_function and relation_function, but it means more copying and would be a performance hit. If people want it then we could perhaps offer it as a config option, or provide a method callable from Lua to invoke it.)

Implementation issues

Scanned relation tags are currently stored in a map of std::strings. Other entity tags are stored in a map of protozero views. This means OsmLuaProcessing::Holds and ::Find have to check a boolean before working out which one to interrogate, which is a bit ugly. Perhaps we could move scanned relation tags to use protozero views, at which point we can just set currentTags to that and be done with it.

The post-scan code is currently single-threaded - the number of nested relations is probably so low that multi-threading would only save seconds on a full planet - but that could of course be changed.

I've added a HasTags Lua-accessible method as a quick way of testing whether a relation has tags (this is useful in Geofabrik extracts which ship empty relations outside the area).

@cldellow This will of course interact with #626 - I don't mind tidying it up to work with that (assuming SetTag is still doable...) but thought I'd get a proof-of-concept working with the current codebase first.

cldellow commented 7 months ago

Nice! That seems like a pragmatic way to fully expose relations.

Is there ever a case where a relation is a member of multiple relations, and you'd like to bounce down a tag that ought to have cardinality many?

Contrived example: if hiking route A was a member of two other relations, hiking route B and hiking route C. Naively putting a single child_of on A won't work, as C will stomp on B's SetTag call.

Not the end of the world, as a user could always work around it (for example, by encoding multiple items in child_of, or adopting a scheme like child_of_1, child_of_2, and so on.)

This will of course interact with https://github.com/systemed/tilemaker/pull/626 - I don't mind tidying it up to work with that (assuming SetTag is still doable...)

I think the part of #626 that you're thinking of is that some processing is moved to happen in the kaguya layer, and the OsmLuaProcessing function becomes a shell of what it was, essentially just returning the result that was computed in the kaguya layer. I think extending this PR to that one should be relatively straightforward. Happy to integrate this into the PR once it's merged, or feel free to grab the commits and modify them yourself, just let me know.

systemed commented 7 months ago

Is there ever a case where a relation is a member of multiple relations, and you'd like to bounce down a tag that ought to have cardinality many?

Absolutely, yes. For example, a regional cycle route (relation A) might be a member of a national cycle route (relation B) but also the country part (relation C) of an international cycle relation (relation D) - i.e. the inheritance chains are B->A and D->C->A. In this case, I'd want to show the route polyline with international cycle route styling, but render shields for both the national and the international routes. (Possibly not the regional one.)

So the expectation here is that the user will write some Lua code to do whatever they need - in my case I'd iterate through all the parent relations, gather up the refs and choose the highest network. But other people will have different needs.

Happy to integrate this into the PR once it's merged, or feel free to grab the commits and modify them yourself, just let me know.

Thanks! #626 is now in the v3 branch. If you have a spare hour to merge this into v3 then go for it, but don't worry if not and I can take a stab at it.

cldellow commented 7 months ago

I tried to merge this into v3 in https://github.com/cldellow/tilemaker/tree/v3-post-relation-scan

When I went to test it, I realized perhaps I don't understand how it's meant to work. :)

I was imagining the usage is:

  1. write a relation_scan_function that accepts the relations
  2. write a relation_postscan_function that propagates tags using SetTag
  3. write a relation_function that emits a geometry and some tags that were set in (2)

But I think as of this PR, setRelation's Find only consults the currentTags which have been freshly read from the PBF -- so any updates from SetTag will not be visible.

I figure either I misunderstood how it's meant to work, or I misunderstood that the last piece of integrating it into Find wasn't done yet (since they'd conflict with/have to be rewritten for #626).

Just wanted to check before I went down the wrong path - can you confirm my understanding?

systemed commented 7 months ago

That's a really good point...!

I'd been working with relations on ways, so way_function was reading the amended (by SetTag) tags from osmStore.scannedRelations.relationTags. So for that purpose it works fine.

But you're right, for relation_function, we read the tags in afresh with setRelation. I guess we should look to see if relationTags contains an entry for this particular relation ID, and if so, use that rather than the tags read from the PBF.

cldellow commented 7 months ago

Oh, of course. You'd think the several mentions of way_function in the description, including this one:

Then, when way_function comes to examine the relations that the way is a member of, it'll be able to do things based on the tags we've set

would have caused me to clue in, eh? Ah, well. Happily, that usage does seem to work in that branch.

The penny is slowly dropping over here - I guess ensuring that Find in relation_function shows the updated tags is maybe much less important? They're already able to read their parent relation via FindInRelation, so it'd just be their grandparent (and deeper ancestors) that they'd not have access to.

Given that, I haven't tried to tackle relation_function. I've tidied up my merge a bit and opened https://github.com/systemed/tilemaker/pull/642

systemed commented 7 months ago

Works well - I've just tested it with bouncing down a handful of nested cycle route relations and it's doing what it should. Thank you!

For relation_function - the easy change would be for setRelation to use relations_for_relation_with_parents rather than relations_for_relation, which would get the grandparents etc. But that's a bit messy in that it introduces inconsistent handling for area ways vs multipolygon relations.

The more consistent change would be to set currentTags from osmStore.scannedRelations.relationTags if it exists. But we have a slight fly in the ointment in that relationTags is a map of <std::string,std::string> while currentTags expects <protozero::data_view,protozero::data_view>. Not sure how easy that is to fix.

cldellow commented 7 months ago

ah, good call. I pushed a change that I think achieves that - it creates a temporary TagMap on the fly from the string map if present.

systemed commented 7 months ago

Merged into v3.