NeTEx-CEN / NeTEx

NeTEx is a CEN Technical Standard for exchanging Public Transport schedules and related data.
http://netex-cen.eu
GNU General Public License v3.0

Schema XSD validator #213

Closed. agougoua closed this issue 3 years ago

agougoua commented 3 years ago

Hi,

I don't know if this is the best place to ask my question: do you know if there is a NeTEx XSD schema validator? Thanks in advance for your help.

Best regards,

Alban GOUGOUA, Data Scientist/Analyst at the French Transport Regulatory Body. Email: alban.gougoua@autorite-transports.fr

skinkie commented 3 years ago

@agougoua Sure. The simplest approach is xmllint --noout --schema /path/to/NeTEx_publication.xsd /path/to/netexfile.xml, which even accepts .xml.gz input. There is also a Python script, based on lxml, that significantly speeds up validation of EPIP line-based exports in several ways, and there is still room for optimisation.
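For validating several files in one go, the xmllint call above can be wrapped in a small script. The following is only a sketch: the function names build_xmllint_command and validate_with_xmllint are made up for illustration, and it assumes that xmllint is on the PATH and returns a non-zero exit code when validation fails.

```python
import subprocess
from typing import List, Sequence, Tuple


def build_xmllint_command(xsd_path: str, xml_path: str) -> List[str]:
    """Build the same command line as quoted in the comment above."""
    return ["xmllint", "--noout", "--schema", xsd_path, xml_path]


def validate_with_xmllint(
    xsd_path: str, xml_paths: Sequence[str]
) -> List[Tuple[str, bool, str]]:
    """Run xmllint once per file; return (path, is_valid, messages) tuples."""
    results = []
    for xml_path in xml_paths:
        proc = subprocess.run(
            build_xmllint_command(xsd_path, xml_path),
            capture_output=True,
            text=True,
        )
        # Assumption: exit code 0 means the file validated; xmllint writes
        # its diagnostics to stderr.
        results.append((xml_path, proc.returncode == 0, proc.stderr.strip()))
    return results
```

Each call starts a new xmllint process, so the schema is parsed again for every file. That is acceptable for a handful of files but adds up quickly on large exports.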

agougoua commented 3 years ago

Thanks a lot @skinkie! We had the same idea in Python. I thought there might be another solution, such as an open-source program developed by your team for this purpose.

Thanks again and have a good day! Best regards,


skinkie commented 3 years ago

@agougoua The Python prototype can be found on Basecamp. At the moment the biggest bottleneck in libxml2 is that key identity constraints are processed serially. That check could be parallelized, and the Python prototype does so, but it needs an extra trick to stop the regular validator from checking those constraints as well. In addition, if a profile such as the Nordic profile from Entur splits data over multiple files, the validator should be able to check external references. xmllint will likely never be able to do that for key identity constraints.
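To illustrate what checking references across files could look like, here is a minimal sketch. It is not the prototype mentioned above and it does not evaluate the XSD key/keyref constraints. It only assumes the common NeTEx convention that entities carry an id attribute and references carry a ref attribute, and it ignores versions. The function name find_unresolved_refs is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def find_unresolved_refs(documents: Dict[str, bytes]) -> List[Tuple[str, str]]:
    """Return (file name, ref value) pairs whose ref matches no id in any file."""
    roots = {name: ET.fromstring(data) for name, data in documents.items()}

    # First pass: collect every id declared anywhere in the delivery.
    known_ids = set()
    for root in roots.values():
        for element in root.iter():
            if "id" in element.attrib:
                known_ids.add(element.attrib["id"])

    # Second pass: report refs that do not resolve to a collected id.
    unresolved = []
    for name, root in roots.items():
        for element in root.iter():
            ref = element.attrib.get("ref")
            if ref is not None and ref not in known_ids:
                unresolved.append((name, ref))
    return unresolved
```

References to objects that are deliberately defined outside the delivery would also be reported here, so the result is a list of candidates to inspect rather than a list of errors.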

agougoua commented 3 years ago

Hi @skinkie, I hope you are well.

If I want to validate a NeTEx ZIP, which obviously consists of many XML files, which XSD schema(s) are appropriate? In my case, I use the French profile.

skinkie commented 3 years ago

I have not seen a French profile XSD, so you would have to validate each file individually, using the full XSD from this repository. I recommend caching the XSD validator object; this saves an enormous amount of overhead.
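A minimal sketch of that advice, assuming lxml is installed: the XSD is compiled once into an lxml XMLSchema object, and that object is reused for every XML file in the ZIP. The function names make_lxml_validator and validate_zip are hypothetical, and the paths in the usage note are placeholders.

```python
import zipfile
from typing import Callable, Dict, List


def make_lxml_validator(xsd_path: str) -> Callable[[bytes], List[str]]:
    """Compile the XSD once and return a function that validates one document."""
    from lxml import etree  # third-party dependency, imported lazily

    # Compiling the full NeTEx schema is the expensive step, so do it once.
    schema = etree.XMLSchema(etree.parse(xsd_path))

    def validate(xml_bytes: bytes) -> List[str]:
        document = etree.fromstring(xml_bytes)
        if schema.validate(document):
            return []
        return [f"line {e.line}: {e.message}" for e in schema.error_log]

    return validate


def validate_zip(
    zip_path, validate: Callable[[bytes], List[str]]
) -> Dict[str, List[str]]:
    """Validate every .xml member of the archive with the same validator."""
    results = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                results[name] = validate(archive.read(name))
    return results
```

Usage would look like validate_zip("/path/to/netex.zip", make_lxml_validator("/path/to/NeTEx_publication.xsd")). The result maps each file name to its list of validation messages, and an empty list means the file validated.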

Aurige commented 3 years ago

Hello, I confirm that the French profile does not have a specific XSD like the one made for EPIP. You can of course build your own, but for now you need to validate against the full XSD.