Open ThomasG77 opened 2 years ago
Solved for communes with some GDAL post-processing https://twitter.com/datagouvfr/status/1521067883022979072 instead of touching the parser.
Sections have the same issue, which is currently unsolved.
It affects the DVF application because a section is empty https://www.data.gouv.fr/fr/datasets/demandes-de-valeurs-foncieres-geolocalisees/#discussion-627b860b8ac61b099fc46a86 As a result, the "Section Cadastrale" dropdown list does not offer the section (it is not available), and the map display does not show it either.
Found by comparing the current GeoJSON output https://cadastre.data.gouv.fr/data/etalab-cadastre/2022-04-01/geojson/communes/80/80695/ with the output from https://cadastre.data.gouv.fr/data/dgfip-pci-vecteur/2022-04-01/edigeo/feuilles/80/80695/ after running GDAL on the THF file within edigeo-806950000D01.tar.bz2 with the command ogr2ogr -f GeoJSON section-d.geojson -dialect SQLite -sql "SELECT * FROM SECTION_id" -lco RFC7946=YES E0000D01.THF
Recipe
wget https://cadastre.data.gouv.fr/data/dgfip-pci-vecteur/2022-04-01/edigeo/feuilles/80/80695/edigeo-806950000D01.tar.bz2
unp edigeo-806950000D01.tar.bz2
ogr2ogr -f GeoJSON section-d.geojson -dialect SQLite -sql "SELECT * FROM SECTION_id" -lco RFC7946=YES E0000D01.THF
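The recipe above can also be scripted. The following is a minimal sketch, not the project's tooling: `sheet_url` and `ogr2ogr_args` are hypothetical helper names, and the URL layout (department = first two characters of the sheet identifier, commune = first five) is an assumption generalized from the single URL quoted above. The THF file name is passed in explicitly rather than derived.

```python
BASE = "https://cadastre.data.gouv.fr/data/dgfip-pci-vecteur"


def sheet_url(millesime: str, sheet_id: str) -> str:
    """Build the download URL of one EDIGEO sheet archive.

    Assumption: the path is <department>/<commune>/edigeo-<sheet>.tar.bz2,
    inferred from the single URL quoted in this issue (two-character
    department codes only).
    """
    dep, commune = sheet_id[:2], sheet_id[:5]
    return f"{BASE}/{millesime}/edigeo/feuilles/{dep}/{commune}/edigeo-{sheet_id}.tar.bz2"


def ogr2ogr_args(thf_path: str, out_path: str, layer: str = "SECTION_id") -> list[str]:
    """Return the ogr2ogr command from the recipe as an argument list."""
    return [
        "ogr2ogr", "-f", "GeoJSON", out_path,
        "-dialect", "SQLite",
        "-sql", f"SELECT * FROM {layer}",
        "-lco", "RFC7946=YES",
        thf_path,
    ]


url = sheet_url("2022-04-01", "806950000D01")
args = ogr2ogr_args("E0000D01.THF", "section-d.geojson")
print(url)
print(" ".join(args))
```

Once the archive has been downloaded and extracted (for instance with `tarfile.open(path, "r:bz2")`), the argument list can be handed to `subprocess.run(args, check=True)`, provided `ogr2ogr` is on the PATH.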
We can find the issues by parsing the output logs of the edigeo-parser, keeping the commune code (characters 1 to 5) and the message (from character 14 onwards), with paste <(cut -c1-5 nohup.out) <(cut -c14- nohup.out) | grep SECTION | sort | uniq
We received feedback about Saint-Just-Luzac (INSEE 17351), which has the same issue.
nohup.out is the file produced by running the following processing in the background: https://github.com/etalab/cadastre#extraction-des-donn%C3%A9es-du-pci-vecteur-et-production-des-fichiers-communaux
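For a per-error summary rather than a sorted list, the same filtering can be sketched in Python. This is an illustration under assumptions: the line layout (a 12-character sheet identifier, a colon, the object, the layer in parentheses, then the reasons) is inferred from the log lines quoted in this issue, and `parse_line` and `count_section_errors` are hypothetical names.

```python
import re
from collections import Counter

# Assumed layout: <12-char sheet id>:<object>(<layer>) => geometry ignored (<reasons>)
LINE_RE = re.compile(
    r"^(?P<sheet>\w{12}):(?P<obj>\w+)\((?P<layer>\w+)\)"
    r" => geometry ignored \((?P<reasons>.*)\)$"
)


def parse_line(line: str):
    """Return (commune, object, layer, reasons) for a matching line, else None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # The commune code is the first five characters of the sheet identifier.
    return m["sheet"][:5], m["obj"], m["layer"], m["reasons"]


def count_section_errors(lines):
    """Count SECTION errors per reason, deduplicated per commune and object
    (the equivalent of `grep SECTION | sort | uniq` on the trimmed lines)."""
    seen = set()
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[2] == "SECTION":
            commune, obj, _, reasons = parsed
            seen.add((commune, obj, reasons))
    return Counter(reasons for _, _, reasons in seen)


sample = [
    "08339000ZV01:Objet_567765(SECTION) => geometry ignored (has-crossing-holes)",
    "15231000ZB01:Objet_1442135(SECTION) => geometry ignored (has-crossing-holes)",
    "571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)",
    "571510002202:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)",
    "010100000A01:Objet_2512020(TLINE) => geometry ignored (Too many linked arcs to build a single LineString)",
]
counts = count_section_errors(sample)
for reasons, n in counts.most_common():
    print(n, reasons)
```

To run it on the real log, replace `sample` with `open("nohup.out", encoding="utf-8")`. Lines that do not follow the assumed layout are skipped silently.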
For fixing sections
To fix the issues, I am tackling them by type of error. At the moment, there are 6 types of errors for sections.
For has-exterior-holes, I mostly fixed it in the branch https://github.com/etalab/edigeo-parser/tree/fix-section-reading-1 but some cases still seem to be unfixed. I then use the cadastre branch https://github.com/etalab/cadastre/tree/update-pkg to check whether the fix is effective in production.
Matching test cases (the numbers correspond to the numbers in the list above)
# 1
08339000ZV01:Objet_567765(SECTION) => geometry ignored (has-crossing-holes)
15231000ZB01:Objet_1442135(SECTION) => geometry ignored (has-crossing-holes)
# 2
571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)
571510002202:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)
# 3
52432111ZK01:Objet_675114(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)
571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)
# 4
52432111ZK01:Objet_675114(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)
577320003301:Objet_1479958(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)
# 5
274485100B01:Objet_463563(SECTION) => geometry ignored (The input polygon may not have duplicate vertices (except for the first and last vertex of each ring))
# 6
395110000U01:Objet_1314984(SECTION) => geometry ignored (Unable to build valid polygon coordinates)
Related issues with parcelles (these are parsing issues, so the affected parcelles are not visible and not provided in our etalab-cadastre delivery)
Overall list of issues (including polygons, labels, and linestrings)
errors | count | % |
---|---|---|
010100000A01:Objet_2512020(TLINE) => geometry ignored (Too many linked arcs to build a single LineString) | 2429550 | 91.9608666210689 |
06088000OL01:Objet_126251(SUBDFISC) => geometry ignored (Unable to build valid polygon coordinates) | 107689 | 4.07613499024769 |
Impossible de relier la subdivision fiscale à sa parcelle | 84167 | 3.18580406284929 |
ring-has-duplicate-vertices | 15030 | 0.568900341756566 |
has-exterior-holes | 2303 | 0.087170824156046 |
Impossible de relier parcelle et numéro de voie | 1204 | 0.0455725889204861 |
Failed to deintersect polygon: significant secondary polygon | 977 | 0.0369804147635506 |
The input polygon may not have duplicate vertices | 424 | 0.0160488186896064 |
has-crossing-holes | 282 | 0.0106739784680873 |
deintersectPolygon: unexpected error | 180 | 0.00681317774558762 |
Too many linked faces to build a single Polygon | 62 | 0.00234676122348018 |
arc.left.endsWith is not a function | 45 | 0.0017032944363969 |
found non-noded intersection between | 10 | 0.000378509874754868 |
JSTS union has failed: retrying with mapshaper | 8 | 0.000302807899803894 |
Missing required files in EDIGÉO bundle | 8 | 0.000302807899803894 |
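The percentage column is each count divided by the sum of all counts. As a small sketch of that arithmetic, using the counts from the table (the keys are shortened to the error message for the first two rows, which is my own labelling):

```python
# Counts copied from the table above; keys are the error messages.
counts = {
    "Too many linked arcs to build a single LineString": 2429550,
    "Unable to build valid polygon coordinates": 107689,
    "Impossible de relier la subdivision fiscale à sa parcelle": 84167,
    "ring-has-duplicate-vertices": 15030,
    "has-exterior-holes": 2303,
    "Impossible de relier parcelle et numéro de voie": 1204,
    "Failed to deintersect polygon: significant secondary polygon": 977,
    "The input polygon may not have duplicate vertices": 424,
    "has-crossing-holes": 282,
    "deintersectPolygon: unexpected error": 180,
    "Too many linked faces to build a single Polygon": 62,
    "arc.left.endsWith is not a function": 45,
    "found non-noded intersection between": 10,
    "JSTS union has failed: retrying with mapshaper": 8,
    "Missing required files in EDIGÉO bundle": 8,
}

total = sum(counts.values())
# Share of each error type, as a percentage of all logged errors.
shares = {message: 100 * n / total for message, n in counts.items()}

for message, share in shares.items():
    print(f"{share:10.6f}  {message}")
```

Read this way, the TLINE message alone accounts for roughly 92% of all logged errors, so the polygon errors that affect sections and communes are a small part of the total.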
New example of the problem: parcelle ZB 170 overlapping ZB 170 in commune 14191
Some communes are not parsed correctly (at least for the commune polygon). We confirmed this using another parser, GDAL with its EDIGEO driver.
The steps to reproduce the issue are in this gist: https://gist.github.com/ThomasG77/f75f50356d50b9e428dc01c076f6574a
Only 49 communes are affected, but we have seen that other types of layers are affected too, such as TLINE.
We probably need to combine the current parser with the GDAL EDIGEO driver, since the current parser was written to work around some GDAL limitations https://blog.geo.data.gouv.fr/cadastre-millesime-janvier-2018-nouveautes-perspectives-a657d471a178
See also https://github.com/DoFabien/edigeoToGeojson