osm-fr / osmose-backend

Part of osmose that runs the analysis, and send the results to the frontend.
GNU General Public License v3.0
87 stars 109 forks source link

osmosis_relation_multipolygon fails on frontend on duplicated members #2090

Open Famlam opened 7 months ago

Famlam commented 7 months ago

From https://buildbot.osmose.openstreetmap.fr/#/builders/65/builds/1189/steps/3/logs/stdio :

2023-11-29 03:09:24 france_provence_alpes_cote_d_azur_alpes_maritimes : osmosis_relation_multipolygon
2023-11-29 03:09:25   run osmosis all analyser Analyser_Osmosis_Relation_Multipolygon
2023-11-29 03:09:25   analyser_osmosis_relation_multipolygon.py:241 xml generation
2023-11-29 03:09:25   analyser_osmosis_relation_multipolygon.py:244 sql
2023-11-29 03:09:25   analyser_osmosis_relation_multipolygon.py:245 sql
2023-11-29 03:09:25   analyser_osmosis_relation_multipolygon.py:246 xml generation
2023-11-29 03:09:25   analyser_osmosis_relation_multipolygon.py:247 xml generation
2023-11-29 03:09:25   analyser_osmosis_relation_multipolygon.py:248 xml generation
2023-11-29 03:09:27   update
2023-11-29 03:09:27     iteration=1
2023-11-29 03:09:27       error: UPDATE ERROR france_provence_alpes_cote_d_azur_alpes_maritimes/Osmosis_Relation_Multipolygon : {"detail":"Traceback (most recent call last):\n  File \"/data/project/osmose/frontend/control/update.py\", line 77, in user\n    await update_utils.update(db, source_id, f.name, remote_ip=remote_ip)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 90, in update\n    tasks[1].result()\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 72, in async_parser_task\n    await async_update_parser(source_id, fname, remote_ip, db).parse(q)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 495, in parse\n    await self.endElement(*args)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 616, in endElement\n    await table_merge_markers_tmp(self._db, self.all_uuid)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 405, in table_merge_markers_tmp\n    await _db.execute(sql_marker)\n  File \"/data/project/osmose/frontend/osmose-frontend-venv/lib/python3.9/site-packages/asyncpg/connection.py\", line 317, in execute\n    return await self._protocol.query(query, timeout)\n  File \"asyncpg/protocol/protocol.pyx\", line 338, in query\nasyncpg.exceptions.CardinalityViolationError: ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values."}
2023-11-29 03:09:42     iteration=2
2023-11-29 03:09:42       error: UPDATE ERROR france_provence_alpes_cote_d_azur_alpes_maritimes/Osmosis_Relation_Multipolygon : {"detail":"Traceback (most recent call last):\n  File \"/data/project/osmose/frontend/control/update.py\", line 77, in user\n    await update_utils.update(db, source_id, f.name, remote_ip=remote_ip)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 90, in update\n    tasks[1].result()\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 72, in async_parser_task\n    await async_update_parser(source_id, fname, remote_ip, db).parse(q)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 495, in parse\n    await self.endElement(*args)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 616, in endElement\n    await table_merge_markers_tmp(self._db, self.all_uuid)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 405, in table_merge_markers_tmp\n    await _db.execute(sql_marker)\n  File \"/data/project/osmose/frontend/osmose-frontend-venv/lib/python3.9/site-packages/asyncpg/connection.py\", line 317, in execute\n    return await self._protocol.query(query, timeout)\n  File \"asyncpg/protocol/protocol.pyx\", line 338, in query\nasyncpg.exceptions.CardinalityViolationError: ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values."}
2023-11-29 03:10:12     iteration=3
2023-11-29 03:10:12       error: UPDATE ERROR france_provence_alpes_cote_d_azur_alpes_maritimes/Osmosis_Relation_Multipolygon : {"detail":"Traceback (most recent call last):\n  File \"/data/project/osmose/frontend/control/update.py\", line 77, in user\n    await update_utils.update(db, source_id, f.name, remote_ip=remote_ip)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 90, in update\n    tasks[1].result()\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 72, in async_parser_task\n    await async_update_parser(source_id, fname, remote_ip, db).parse(q)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 495, in parse\n    await self.endElement(*args)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 616, in endElement\n    await table_merge_markers_tmp(self._db, self.all_uuid)\n  File \"/data/project/osmose/frontend/control/update_utils.py\", line 405, in table_merge_markers_tmp\n    await _db.execute(sql_marker)\n  File \"/data/project/osmose/frontend/osmose-frontend-venv/lib/python3.9/site-packages/asyncpg/connection.py\", line 317, in execute\n    return await self._protocol.query(query, timeout)\n  File \"asyncpg/protocol/protocol.pyx\", line 338, in query\nasyncpg.exceptions.CardinalityViolationError: ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values."}

Caused by these two entries being present three times each:

<error class="2" subclass="1085940701017941918">
<relation id="9104912" version="6" user="baptistemm">
<tag k="name" v="Villantipolis 473" />
<tag k="type" v="site" />
<tag k="building" v="yes" />
<tag k="addr:street" v="Route des Dolines" />
<tag k="addr:housenumber" v="473" />
</relation>
<way id="1204376479" version="1" user="baptistemm">
<tag k="source" v="cadastre-dgi-fr source : Direction Générale des Impôts - Cadastre. Mise à jour : 2009" />
<tag k="building" v="commercial" />
</way>
<location lat="43.6221999775615" lon="7.038363911435729" />
<text lang="en" value="building=(yes,commercial)" />
</error>

<error class="2" subclass="1085940701017941918">
<relation id="9104912" version="6" user="baptistemm">
<tag k="name" v="Villantipolis 473" />
<tag k="type" v="site" />
<tag k="building" v="yes" />
<tag k="addr:street" v="Route des Dolines" />
<tag k="addr:housenumber" v="473" />
</relation>
<way id="42366087" version="6" user="baptistemm">
<tag k="source" v="cadastre-dgi-fr source : Direction Générale des Impôts - Cadastre. Mise à jour : 2009" />
<tag k="building" v="commercial" />
</way>
<location lat="43.62221440277423" lon="7.038245860191274" />
<text lang="en" value="building=(yes,commercial)" />
</error>

Reason: these ways are in the multipolygon relation 3 times each.

Two things to fix/add:

  1. Probably we should add DISTINCT to this analyser (it seems to be the only one failing here). The other analyser (sql30) uses GROUP BY and is thus not affected. But we can also see if it works if we just use the multipolygons table here (sql20 and sql30) and use its validity checks.
  2. ~It seems there's no analyser yet which detects duplicated members in multipolygons. Maybe good to add one?~ EDIT: done, see #2092
frodrigo commented 7 months ago

Class 2 and 3 of this analyzer look outdated to me. We cannot use any more member nature to define the multipolygone nature. So I think now, whatever the nature of the member it is now correct, isn't?

Famlam commented 7 months ago

The results look valid to me though. (At least sql20).

If the multipolygon relation has 'building= appartments' and the outer way member has 'building=shed', they conflict, because then the multipolygon is simultaneously a shed and an appartement, which isn't possible.

(The inner ways can of course be anything)

Maybe I misunderstand what you mean?

frodrigo commented 7 months ago

Historically it was possible to define the nature of a MP with tags on outer, or tags on the relation.

At some point, the usage of the outer tags was deprecated, and massively fixed in the data to use tags on relations. Now only the relation tags matters.

If the multipolygon relation has 'building= appartments' and the outer way member has 'building=shed', they conflict, because then the multipolygon is simultaneously a shed and an appartement, which isn't possible.

To be verified, but I think currently having a building tag on the outer is an issue, I think it means the outer itself is a building, in addition with the MP.

Famlam commented 7 months ago

Ah I see what you mean. Yes, the outer way should be left untagged (excluding perhaps things like source, fixme, note, ... as these may reflect a need to change that specific outer geometry).

Famlam commented 7 months ago

... but I'm not sure what to do in case you (for example) have the multipolygon with landuse=grass and the outer way with barrier=fence. What do you think of such cases? Technically the outer way is a separate object. Maybe better to keep the warnings on the safe side and only warn in case there's tag overlap?

p.s. I've added a PR to find the cases where relation members are duplicated.

Famlam commented 7 months ago

There's actually a different flaw in sql20 too... the relation that causes the error is a site relation, but the AND/OR do not have parenthesis, so the type=multipolygon-filter doesn't work.

An example that is actually a true multipolygon causing the same error is https://www.openstreetmap.org/relation/16118298 (from Kosovo)