Open polish96 opened 12 months ago
Could you give example data? It's not fully clear to me what you mean
For example: https://www.openstreetmap.org/node/3987932537/ is a duplicate of https://www.openstreetmap.org/way/315587859/ and https://www.openstreetmap.org/node/3987932526 is a duplicate of https://www.openstreetmap.org/node/3987932530.
The algorithm should check:
If all the above conditions are met, the system should signal that the address is likely duplicated and needs to be checked and corrected.
Ok, I understand. Addresses are always difficult, as for example two different shops/offices may exist with the same address (and buildings can contain multiple addresses). So if we implement it, we should probably limit it to "bare" addresses only, e.g. no other tags besides addr:* tags (and possibly selected tags like source etc).
I'm unfortunately not too familiar with all the different address-tagging conventions worldwide...
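To make the "bare address" idea concrete, here is a minimal Python sketch of such a filter. It is not existing Osmose code: the helper name `is_bare_address` and the `ALLOWED_EXTRA_KEYS` set are assumptions chosen for illustration, and which extra keys to tolerate would need to be decided per tagging convention.

```python
# Hypothetical helper, not part of Osmose: decide whether an element is a
# "bare" address, i.e. it carries addr:* tags and nothing else of note.
ALLOWED_EXTRA_KEYS = {"source"}  # assumption: non-address keys we tolerate


def is_bare_address(tags: dict[str, str]) -> bool:
    """Return True if tags hold at least one addr:* key and only allowed extras."""
    has_address = any(key.startswith("addr:") for key in tags)
    only_allowed = all(
        key.startswith("addr:") or key in ALLOWED_EXTRA_KEYS for key in tags
    )
    return has_address and only_allowed
```

With this filter, a shop node that also carries an address would be skipped, while a plain address node with only a `source` tag would still be checked.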
> Whether both elements are within a distance of no more than 5 km from each other
It's not impossible that two cities with the same street name have that street close to the shared city border, so we'd have to limit it to e.g. nodes within building perimeters or so, maybe with a buffer of a couple of meters, definitely not km ;) . (Also for performance reasons)
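As a rough illustration of a metre-scale limit, the sketch below compares two coordinates with the haversine formula. It is a simplification: it uses a plain distance threshold rather than building perimeters, the 10 m default is an assumed placeholder (the thread only says "a couple of meters"), and the function names are hypothetical rather than anything from Osmose.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres
MAX_DISTANCE_M = 10.0       # assumption: placeholder for "a couple of meters"


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def close_enough(a, b, max_distance_m=MAX_DISTANCE_M):
    """a and b are (lat, lon) tuples; True if they lie within the threshold."""
    return haversine_m(a[0], a[1], b[0], b[1]) <= max_distance_m
```

A small threshold also keeps the number of candidate pairs low, which is the performance concern mentioned above.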
> So if we implement it, we should probably limit it to "bare" addresses only, e.g. no other tags besides addr:* tags (and possibly selected tags like source etc).
I agree. I think we should limit it to addresses without additional tags.
Note that there are already multiple checks here: https://github.com/osm-fr/osmose-backend/blob/dev/analysers/analyser_osmosis_relation_associatedStreet.py#L615
If you mean 'item 2060 - street numbers,' I am aware that it exists. Unfortunately, I have searched through all 'class' elements belonging to 'item 2060,' and I couldn't find a tool that searches for duplicated addresses.
I would like to propose an enhancement to the "item 1010 - duplicated node" check. Currently, this check does not effectively identify duplicated addresses, where two nodes or lines share the same building number, street name, and locality name.
The current implementation of "item 1010 - duplicated node" fails to accurately detect duplicate addresses, hindering the tool's ability to identify nodes or lines with identical building numbers, street names, and locality names.
I suggest implementing a more robust algorithm for the "item 1010 - duplicated node" check that considers additional criteria, such as building numbers, street names, and locality names. This enhancement will enable Osmose to accurately identify and flag nodes or lines with duplicated addresses, providing a more comprehensive and valuable verification tool for the OSM community.
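A minimal sketch of what such a comparison could look like, in Python: elements are grouped by a normalised address key and any group with more than one member is reported. This is only an illustration, not the Osmose implementation. The function names are hypothetical, and mapping "locality name" to `addr:city` is an assumption (some regions use `addr:place` or other keys instead).

```python
from collections import defaultdict

# Assumption: these three keys stand in for building number, street, locality.
ADDRESS_KEYS = ("addr:housenumber", "addr:street", "addr:city")


def address_key(tags):
    """Return a normalised (housenumber, street, city) tuple, or None if incomplete."""
    parts = [tags.get(key) for key in ADDRESS_KEYS]
    if any(part is None for part in parts):
        return None
    # Trim whitespace and ignore case so trivially different spellings match.
    return tuple(part.strip().casefold() for part in parts)


def find_duplicate_addresses(elements):
    """elements: iterable of (element_id, tags). Return lists of ids sharing a key."""
    groups = defaultdict(list)
    for element_id, tags in elements:
        key = address_key(tags)
        if key is not None:
            groups[key].append(element_id)
    return [ids for ids in groups.values() if len(ids) > 1]
```

Each returned group is a candidate only; a real check would still need the distance and "bare address" restrictions discussed in this thread before flagging anything.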
In the event that the proposed enhancement cannot be seamlessly integrated into the existing "item 1010 - duplicated node" check, an alternative solution could involve creating a new validation rule specifically designed to address the identified issue. This new rule could be assigned a distinct item number for clear reference and tracking purposes.