My idea for surfacing the provenance of values in the output to the user, and providing clarity and control for the round-trip problem is this: Let the user configure if and when osm2lanes should guess details, instead of annotating every detail with its provenance.
Provides a straightforward way for the user to get what they want
If osm2lanes always guesses and annotates data, the chance that the user needs to process the output to filter out guesses that they don't want is high. If the level of guessing is customisable, it is likely that one of the options is exactly what they want:
Strict Mode for people implementing an editor and round-tripping: describes exactly what the tags describe, warnings for any tag that could not be reconciled, and errors for straight-up conflicts.
Consensus Mode for people rendering an "accurate" map (and probably for the visualisation in an editor too): makes assumptions that mappers expect, such as those documented on the wiki.
Fanciful Mode for people making something pretty or A/B Street: assumes as much as possible, describing a plausible road that is not proven wrong by the tags, and never fails.
Custom Mode could let the user opt into and even configure exactly the guesses they want.
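To make the modes concrete, here is a rough sketch of how a mode setting could gate guesses. It is written in Python for brevity and is not osm2lanes code: `Mode`, `convert`, the two-lane assumption, and the 3.5 m width are all hypothetical placeholders.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Hypothetical guessing levels, ordered from least to most permissive."""
    STRICT = 0     # only what the tags state
    CONSENSUS = 1  # plus assumptions mappers expect
    FANCIFUL = 2   # plus anything plausible; never fails


def convert(tags, mode):
    """Convert a tag dict into lanes, guessing only as far as `mode` allows."""
    warnings = []
    if "lanes" in tags:
        count = int(tags["lanes"])
    elif mode >= Mode.CONSENSUS:
        # Assumption for this sketch: an untagged road has two lanes.
        count = 2
    else:
        # Strict mode emits nothing it cannot derive from the tags.
        count = 0
        warnings.append("lanes is not tagged; no lanes emitted")

    lanes = [{"type": "travel"} for _ in range(count)]

    if mode >= Mode.FANCIFUL:
        # Assumption for this sketch: a made-up default width in metres.
        for lane in lanes:
            lane["width"] = 3.5

    return {"lanes": lanes, "warnings": warnings}
```

With `{"highway": "residential"}` as input, strict mode would return no lanes and one warning, consensus mode two lanes without widths, and fanciful mode two lanes with the assumed width. The output contains no provenance annotations in any mode.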
Keeps the schema simple
Tagging every value in every entry in the output will turn the resulting JSON into a wall of text in no time, and I am not looking forward to any part of that experience.
Provides a way to check provenance
The same tags can be converted in different modes, and the user can compare between the two outputs to make more nuanced decisions.
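As a rough illustration of that comparison, the sketch below diffs one lane from a strict conversion against the same lane from a more permissive one. It is Python for brevity; `guessed_keys` and the lane dictionaries are hypothetical and do not reflect the actual osm2lanes schema.

```python
_MISSING = object()  # sentinel so an absent key never equals a real value


def guessed_keys(strict_lane, permissive_lane):
    """Return the keys that are new or different in the permissive output."""
    return sorted(
        key
        for key, value in permissive_lane.items()
        if strict_lane.get(key, _MISSING) != value
    )


# Hypothetical outputs for the same lane in two modes.
strict = {"type": "travel", "direction": "forward"}
permissive = {"type": "travel", "direction": "forward", "width": 3.5}

print(guessed_keys(strict, permissive))
```

Anything the function reports was assumed rather than tagged. The sketch compares lanes pairwise, so a guess that changes the number of lanes would need a comparison at the road level instead.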
Enables flexibility in implementing guesses
I feel like this will be easier to implement than tracking provenance, especially if we come up with guesses that have knock-on effects in the rest of the parsing process (such as arbitrarily resolving conflicts). There will be no need to keep track of all the values that have been affected by the guess, while the user can still figure out what was assumed.
In case the GitHub reactions are insufficient -- I'm in agreement here. The current schema is slowly getting complex. The common case will have loads of inferred values for width!