OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

Can't open properly formatted and validated GML file with GMLAS driver. #2361

Open gpprojekt-marcin opened 4 years ago

gpprojekt-marcin commented 4 years ago

Expected behavior and actual behavior.

Actual behavior

$ xmllint --noout 3217__OT_ADJA_A_Walcz.gml && echo 'good' || echo 'bad'
good
$ ogrinfo -al -so GMLAS:3217__OT_ADJA_A_Walcz.gml
ERROR 1: ./BT_ModelPodstawowy.xsd:398:91 type 'urn:gugik:specyfikacje:gmlas:mapaZasadnicza:1.0:MZ_OgolnyObiektPropertyType' not found. You may retry with the HANDLE_MULTIPLE_IMPORTS=YES open option
$ ogrinfo -al -so -oo HANDLE_MULTIPLE_IMPORTS=YES GMLAS:3217__OT_ADJA_A_Walcz.gml
ERROR 1: MZ_MapaZasadnicza.xsd:6:105 global element 'MZ_OgolnyObiekt' declared more than once
ERROR 1: MZ_MapaZasadnicza.xsd:7:42 global type 'complexType:MZ_OgolnyObiektType' declared more than once or also declared as simpleType
ERROR 1: MZ_MapaZasadnicza.xsd:21:50 global type 'complexType:MZ_OgolnyObiektPropertyType' declared more than once or also declared as simpleType

Expected

$ ogrinfo -al -so 3217__OT_ADJA_A_Walcz.gml
INFO: Open of `3217__OT_ADJA_A_Walcz.gml'
      using driver `GML' successful.

Layer name: OT_ADJA_A
Geometry: MultiPolygon
Feature Count: 1
Layer SRS WKT:
PROJCRS["ETRS89 / Poland CS92",
    BASEGEOGCRS["ETRS89",
        DATUM["European Terrestrial Reference System 1989",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4258]],
    CONVERSION["Poland CS92",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",19,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9993,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",-5300000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (x)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (y)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["unknown"],
        AREA["Poland"],
        BBOX[49,14.14,55.93,24.15]],
    ID["EPSG",2180]]
gml_id: String (0.0) NOT NULL
lokalnyId: String (36.0)
przestrzenNazw: String (20.0)
wersjaId: String (19.0)
czyObiektBDOO: Integer(Boolean) (0.0)
x_kod: String (6.0)
x_skrKarto: String (0.0)
x_katDoklGeom: String (3.0)
x_zrodloDanychG: String (3.0)
x_zrodloDanychA: String (3.0)
x_katIstnienia: String (0.0)
x_rodzajReprGeom: String (2.0)
x_uwagi: Integer (0.0)
x_aktualnoscG: String (10.0)
x_aktualnoscA: String (10.0)
poczatekWersjiObiektu: String (19.0)
x_dataUtworzenia: String (10.0)
x_informDodatkowa: String (1.0)
x_kodKarto10k: String (8.0)
x_kodKarto25k: String (8.0)
x_kodKarto50k: String (8.0)
x_kodKarto100k: String (8.0)
x_kodKarto250k: String (0.0)
x_kodKarto500k: String (0.0)
x_kodKarto1000k: String (0.0)
nazwa: String (6.0)
posList: String (66239.0)
idPRG: Integer (0.0)
idTerytJednostkiNadrzednej: Integer (0.0)
idTerytTerc: Integer (0.0)
rodzaj: String (2.0)
PRG|BT_ReferencjaDoObiektu|idIIP|BT_Identyfikator|lokalnyId: Integer (0.0)
PRG|BT_ReferencjaDoObiektu|idIIP|BT_Identyfikator|przestrzenNazw: String (12.0)

Steps to reproduce the problem.

wget https://gist.github.com/gpprojekt/b44d44f427d41fad91e4951c5539e478/raw/7d4f76e91cf135b89fcdd854ab2e80a24591ce9a/3217__OT_ADJA_A_Walcz.gml
for XSD in BT_ModelPodstawowy.xsd  MZ_MapaZasadnicza.xsd  OT_BDOT10k_BDOO.xsd  OT_BDOT10k_Slowniki.xsd; do
    wget "https://gist.githubusercontent.com/gpprojekt/0f45c47bac76395f69d8546c5f6451f2/raw/1a4bd90d2ff1ee0f2bc738d2f669ba1b308a3515/$XSD"
done
ogrinfo -al -so GMLAS:3217__OT_ADJA_A_Walcz.gml

Operating system, GDAL version and provenance

  1. Ubuntu 19.10 GDAL 2.4.2, released 2019/06/28 http://pl.archive.ubuntu.com/ubuntu eoan/universe
  2. Ubuntu 18.04 GDAL 3.1.0dev, released 2020/03/27 Docker image osgeo/gdal:ubuntu-small-latest
rouault commented 4 years ago

I've diagnosed the issue to be a cyclic dependency chain. The MZ namespace imports the BT one, and the BT one imports the MZ one. Not sure if this is allowed per XML specifications, but at the very least it confuses Xerces-C. When I "move" the content of the MZ schema into the BT one, replacing the namespaces accordingly and removing the import of the MZ schema, the driver can open the file. The driver is then confused by ot:geometria, but that's a different limitation.
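
For readers who want to check their own schema set for this kind of loop, here is a minimal Python sketch. It is not part of GDAL or Xerces-C; the function names `import_graph` and `find_cycle` are made up for illustration. It assumes all the .xsd files sit in one directory and only looks at xs:import elements (xs:include and xs:redefine are ignored).

```python
import xml.etree.ElementTree as ET
from pathlib import Path

XS = "{http://www.w3.org/2001/XMLSchema}"


def import_graph(xsd_dir):
    """Map each schema's targetNamespace to the set of namespaces it imports."""
    graph = {}
    for path in sorted(Path(xsd_dir).glob("*.xsd")):
        root = ET.parse(path).getroot()
        target = root.get("targetNamespace", "")
        imports = {el.get("namespace", "") for el in root.iter(XS + "import")}
        graph.setdefault(target, set()).update(imports)
    return graph


def find_cycle(graph):
    """Return one import cycle as a list of namespaces, or None if there is none."""
    done = set()  # namespaces already fully explored without finding a cycle

    def visit(node, stack):
        if node in stack:
            # We came back to a namespace on the current path: that is a cycle.
            return stack[stack.index(node):] + [node]
        if node in done:
            return None
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt, stack + [node])
            if cycle:
                return cycle
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    # Assumption: the XSD files from the reproduction steps are in the current directory.
    print(find_cycle(import_graph(".")))
```

If the diagnosis above is right, running this in the directory holding the four XSD files should report a loop between the BT and MZ namespaces; I have not verified the exact output against those files.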

gpprojekt-marcin commented 4 years ago

@rouault Thanks for the fast reply. So what now? I want to help, but my C/C++ skills end at ./configure && make -j9 && make install. SCMPrint from the Xerces package doesn't complain when resolving BT_ModelPodstawowy.xsd: SCMPrint BT_ModelPodstawowy.xsd > output.txt

  1. Do you consider this a bug, is it a Xerces thing, or is this use case too narrow and the issue should be closed?
  2. Is there a way I can help?
  3. Do you think there is a way to get rid of the problem with ot:geometria, or should it have its own issue? It is inherited from another complex type via <element name="geometria" type="gml:PolygonType"/>
gpprojekt-marcin commented 4 years ago

I found out that cyclic dependencies are used in higher-level XSDs too, so I think they are not forbidden by the XML specs. Image from https://stackoverflow.com/a/8944124: gml.xsd dependency graph

rouault commented 4 years ago
  1. Do you consider this a bug, is it a Xerces thing, or is this use case too narrow and the issue should be closed?

I don't know. Might be a limitation due to the way we use Xerces in the driver. Would require deeper investigation.

  3. Do you think there is a way to get rid of the problem with ot:geometria, or should it have its own issue?

One way would be to modify the schema so that ot:geometria isn't directly of type gml:PolygonType, but rather of type gml:PolygonPropertyType, that is you'll have <ot:geometria><gml:Polygon ....>...</gml:Polygon></ot:geometria>, which is generally what is used in most application schemas of GML, and well handled by the driver.
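
Changing the schema this way also means the data has to follow the new shape. Below is a rough Python sketch of that instance-side rewrite, under several assumptions: the file uses the GML 3.2 namespace, the geometry attributes (gml:id, srsName) should move to the new gml:Polygon element, and you control both the schema and the data. `wrap_in_polygon` is a hypothetical helper, not a GDAL function, and the ot namespace URI in the usage example is a placeholder.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml/3.2"  # assumption: the data uses GML 3.2


def wrap_in_polygon(root, geom_tag):
    """Move the content of every `geom_tag` element into a gml:Polygon child.

    `geom_tag` is the fully qualified tag, e.g. "{namespace-uri}geometria".
    Returns the number of elements that were rewritten.
    """
    polygon_tag = f"{{{GML}}}Polygon"
    count = 0
    # Materialise the list first because the tree is modified while looping.
    for geom in list(root.iter(geom_tag)):
        if any(child.tag == polygon_tag for child in geom):
            continue  # already in property form, leave it alone
        children = list(geom)
        polygon = ET.Element(polygon_tag, dict(geom.attrib))
        polygon.text = geom.text
        for child in children:
            geom.remove(child)
        polygon.extend(children)
        geom.attrib.clear()
        geom.text = None
        geom.append(polygon)
        count += 1
    return count


if __name__ == "__main__":
    # Placeholder namespace: replace with the real ot namespace from the schema.
    doc = ET.fromstring(
        '<r xmlns:ot="urn:example:ot" xmlns:gml="http://www.opengis.net/gml/3.2">'
        '<ot:geometria gml:id="g1"><gml:exterior/></ot:geometria></r>'
    )
    wrap_in_polygon(doc, "{urn:example:ot}geometria")
    print(ET.tostring(doc, encoding="unicode"))
```

The function skips elements that already contain a gml:Polygon child, so running it twice on the same tree should be harmless.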