Closed kallemooo closed 3 months ago
I ran some performance checks on a very large project containing around 700k lines of ARXML. Parsing takes around 6.8 seconds with the standard xml.etree versus 8.2 seconds with lxml.etree. This is surprising, since the lxml documentation claims it is very fast. Further investigation is needed here.
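To help reproduce a comparison like the one above, here is a minimal timing sketch. It only measures raw parsing of a synthetic document; `make_sample`, `time_parse` and `compare` are hypothetical helper names, and the generated XML is a stand-in for real ARXML, so the numbers will not match the project measured above. lxml is treated as optional and skipped if it is not installed.

```python
import io
import time
import xml.etree.ElementTree as StdET

try:
    from lxml import etree as LxmlET  # optional third-party dependency
except ImportError:
    LxmlET = None


def make_sample(n_elements: int) -> bytes:
    """Build a synthetic XML document (a stand-in for a real ARXML file)."""
    body = "".join(
        f"<ELEMENT><SHORT-NAME>Item{i}</SHORT-NAME></ELEMENT>"
        for i in range(n_elements)
    )
    return f"<AUTOSAR><ELEMENTS>{body}</ELEMENTS></AUTOSAR>".encode("utf-8")


def time_parse(etree_module, data: bytes, repeats: int = 3) -> float:
    """Return the best wall-clock time (seconds) over `repeats` parses."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        etree_module.parse(io.BytesIO(data))
        best = min(best, time.perf_counter() - start)
    return best


def compare(n_elements: int = 10_000) -> dict:
    """Time both parsers on the same input; lxml is skipped if missing."""
    data = make_sample(n_elements)
    results = {"xml.etree": time_parse(StdET, data)}
    if LxmlET is not None:
        results["lxml.etree"] = time_parse(LxmlET, data)
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f} s")
```

If the figures quoted above include the library's own tree traversal and not just parsing, timing the two phases separately would show where the difference comes from.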
I'd also suggest switching to lxml: while fixing some bugs, I found its XPath lookup functionality useful.
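As an illustration of the XPath lookups mentioned above, here is a small sketch using lxml's `xpath()` method with an XPath variable. The sample document and the helper name `find_by_short_name` are made up for the example. Real ARXML files declare an XML namespace, which this sample leaves out for brevity.

```python
try:
    from lxml import etree  # xpath() is lxml-specific
except ImportError:
    etree = None

SAMPLE = b"""<AUTOSAR>
  <AR-PACKAGE>
    <SHORT-NAME>Signals</SHORT-NAME>
    <ELEMENTS>
      <SYSTEM-SIGNAL><SHORT-NAME>Speed</SHORT-NAME></SYSTEM-SIGNAL>
      <SYSTEM-SIGNAL><SHORT-NAME>Torque</SHORT-NAME></SYSTEM-SIGNAL>
    </ELEMENTS>
  </AR-PACKAGE>
</AUTOSAR>"""


def find_by_short_name(root, name):
    """Return every element that has a SHORT-NAME child equal to `name`."""
    # $name is an XPath variable; lxml binds it from the keyword argument,
    # which avoids building the expression by string concatenation.
    return root.xpath("//*[SHORT-NAME = $name]", name=name)


if __name__ == "__main__" and etree is not None:
    root = etree.fromstring(SAMPLE)
    for elem in find_by_short_name(root, "Speed"):
        print(elem.tag, "at line", elem.sourceline)
```

The standard library's `findall()` only supports a limited XPath subset, so a query like this one would need a manual loop there.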
The current parser uses the lightweight standard-library API xml.etree.ElementTree, which supports the current needs of the library.
One problem with the current parser is that when an ARXML error occurs, there is no feedback on where in the input files the problem was found.
Using the etree module from lxml instead, more accurate error information can be printed, since lxml keeps track of each XML element's location in the file.
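A rough sketch of how that location information could be surfaced: lxml elements expose a `sourceline` attribute, while elements from xml.etree do not. The helper `format_location` and the file name `demo.arxml` are hypothetical and not part of the library.

```python
import xml.etree.ElementTree as StdET

try:
    from lxml import etree as LxmlET  # optional third-party dependency
except ImportError:
    LxmlET = None

SAMPLE = b"""<AUTOSAR>
  <ELEMENT>
    <SHORT-NAME>Speed</SHORT-NAME>
  </ELEMENT>
</AUTOSAR>"""


def format_location(elem, filename: str) -> str:
    """Build an error prefix, including the line number when it is known."""
    # lxml sets `sourceline` on parsed elements; xml.etree has no such attribute.
    line = getattr(elem, "sourceline", None)
    if line is None:
        return f"{filename}: <{elem.tag}> (line unknown)"
    return f"{filename}:{line}: <{elem.tag}>"


if __name__ == "__main__":
    std_elem = StdET.fromstring(SAMPLE).find("ELEMENT/SHORT-NAME")
    print(format_location(std_elem, "demo.arxml"))
    if LxmlET is not None:
        lxml_elem = LxmlET.fromstring(SAMPLE).find("ELEMENT/SHORT-NAME")
        print(format_location(lxml_elem, "demo.arxml"))
```

Using `getattr` with a default keeps the helper usable with either parser, which may matter if lxml stays an optional dependency.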
Changing line 1 in base.py to `from lxml import etree as ElementTree` is all that is needed to use lxml. The API is compatible: I ran the current unit tests and all tests pass.
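One possible variation on that one-line change, shown only as a sketch: import lxml when it is installed and fall back to the standard library otherwise. The fallback and the `HAVE_LXML` flag are assumptions added for illustration; the proposal above is simply to replace the import.

```python
try:
    # Preferred: lxml, aliased so the rest of the module is unchanged.
    from lxml import etree as ElementTree
    HAVE_LXML = True
except ImportError:
    # Fallback: the standard-library parser the code uses today.
    import xml.etree.ElementTree as ElementTree
    HAVE_LXML = False

# Both modules provide the fromstring()/find() calls used below.
root = ElementTree.fromstring(b"<AUTOSAR><AR-PACKAGES/></AUTOSAR>")
print("using lxml:", HAVE_LXML, "| root tag:", root.tag)
```

With a fallback like this, features that exist only in lxml (line numbers, full XPath, schema validation) would need to be guarded by the flag.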
lxml also supports XML Schema and schema validation. With lxml as the base, schema validation can be added.
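A minimal sketch of what such validation might look like with `lxml.etree.XMLSchema`. The tiny schema and the `validate` helper are invented for this example; a real setup would load the official AUTOSAR XSD file instead.

```python
try:
    from lxml import etree  # schema validation is lxml-specific
except ImportError:
    etree = None

# Toy schema for illustration only (not the AUTOSAR schema).
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PACKAGE">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="SHORT-NAME" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def validate(xml_bytes: bytes, xsd_bytes: bytes = XSD) -> list:
    """Return 'line N: message' strings; an empty list means valid."""
    if etree is None:
        raise RuntimeError("lxml is required for schema validation")
    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    doc = etree.fromstring(xml_bytes)
    if schema.validate(doc):
        return []
    # The error log records the line number of each violation.
    return [f"line {err.line}: {err.message}" for err in schema.error_log]


if __name__ == "__main__" and etree is not None:
    print(validate(b"<PACKAGE><SHORT-NAME>A</SHORT-NAME></PACKAGE>"))
    print(validate(b"<PACKAGE><NAME>A</NAME></PACKAGE>"))
```

Because the error log carries line numbers, validation errors can be reported with the same file location detail discussed earlier in this issue.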