BuildingSync / BuildingSync-website

line numbers in error report not helpful #49

Closed macintoshpie closed 4 years ago

macintoshpie commented 4 years ago

With Schematron nokogiri, if a rule context matches multiple elements, any failed assertion is reported with the line number of the first matching element. That is misleading: the failing element may be the last match, yet the report points at the first. It also cannot provide line numbers at all when abstract patterns are used, which is what the L000 use case relies on.

If we use lxml's Schematron validation instead, we get the correct line number. Here's some code showing how it would work. ⚠️ NOTE: lxml only supports XSLT 1, so if XSLT 2 is a requirement then this is not an option ⚠️

from lxml import isoschematron, etree

# create the schematron object
sch = 'L000_OpenStudio_Simulation.sch'
sct_doc = etree.parse(sch)
schematron = isoschematron.Schematron(sct_doc, store_report=True)

# load the file to test
source = "Example_A_Valid_Schema.xml"
doc = etree.parse(source)

# run the validation
schematron.validate(doc)

# process the report
NAMESPACES = {
    'svrl': 'http://purl.oclc.org/dsdl/svrl'
}
failed_assert_xpath = '/svrl:schematron-output/svrl:failed-assert'
failed_asserts = schematron.validation_report.xpath(failed_assert_xpath, namespaces=NAMESPACES)
for fa in failed_asserts:
    # 'location' is an XPath pointing at the element that failed the assertion
    location = fa.get('location')
    elem = doc.xpath(location)[0]
    # drop the auc namespace from the tag so the output looks nice
    tag = etree.QName(elem).localname
    # the first child of a failed-assert holds the message text
    print(f'line {elem.sourceline}: element {tag}: {fa[0].text}')
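
To check the line-number claim in isolation, here is a minimal sketch that runs the same steps against a tiny inline schema and document. Both are made up for illustration and are not BuildingSync files. The rule context matches two elements and only the second one fails. `collect_failures` is a hypothetical helper name; unlike the snippet above, it skips a location that matches nothing instead of raising an IndexError.

```python
from lxml import etree, isoschematron

SVRL_NS = {'svrl': 'http://purl.oclc.org/dsdl/svrl'}

# Toy schema and document, invented for this illustration only.
TOY_SCHEMA = b"""<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="item">
      <assert test="@id">item must have an id</assert>
    </rule>
  </pattern>
</schema>"""

TOY_DOC = b"""<root>
  <item id="a"/>
  <item/>
</root>"""


def collect_failures(schematron, doc):
    """Return a (line, tag, message) tuple for each failed assert."""
    schematron.validate(doc)
    failures = []
    report = schematron.validation_report
    for fa in report.xpath('//svrl:failed-assert', namespaces=SVRL_NS):
        # resolve the reported XPath back to the element in the source tree
        matches = doc.xpath(fa.get('location'))
        if not matches:
            continue  # location did not resolve; nothing to report a line for
        elem = matches[0]
        text = fa.findtext('svrl:text', default='', namespaces=SVRL_NS)
        failures.append(
            (elem.sourceline, etree.QName(elem).localname, text.strip()))
    return failures


schematron = isoschematron.Schematron(
    etree.fromstring(TOY_SCHEMA), store_report=True)
doc = etree.ElementTree(etree.fromstring(TOY_DOC))
failures = collect_failures(schematron, doc)
for line, tag, message in failures:
    print(f'line {line}: element {tag}: {message}')
```

Returning tuples rather than printing directly keeps the formatting separate, so the same records could be sorted by line or serialized for the website.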

I tested this with the NY use case and it worked. I also tested it with the L000 use case and it worked as well.

EDIT: originally I wrote that I had to remove the phase element, but it turned out that it was just in the wrong spot according to the RELAX NG definition lxml was using. After moving the phase element after the ns element and before the include elements, it was considered a valid Schematron file.

EDIT: you must change queryBinding to xslt1!
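
Since lxml only handles XSLT 1, it may be worth checking a schema's queryBinding before handing it to lxml. The sketch below uses only the standard library. `query_binding` and `needs_xslt2` are hypothetical helper names, and treating exactly `xslt2` and `xslt3` as unsupported is my assumption about which values to flag, not something taken from lxml.

```python
import io
import xml.etree.ElementTree as ET


def query_binding(sch_source):
    """Return the queryBinding attribute of the schema root, or None if absent."""
    root = ET.parse(sch_source).getroot()
    return root.get('queryBinding')


def needs_xslt2(binding):
    """Assumption: 'xslt2' and 'xslt3' bindings need more than XSLT 1."""
    return binding is not None and binding.lower() in ('xslt2', 'xslt3')


# Inline stand-in for a .sch file; a real path or open file works the same way.
sample = io.StringIO(
    '<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"/>'
)
binding = query_binding(sample)
if needs_xslt2(binding):
    print(f'queryBinding={binding!r}: would need changing before using lxml')
```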

Here's the output from validating "Example - A Valid Schema.xml" from the bsync site against the L000 use case:

line 28: element Building: 
        [ERROR] element 'auc:BuildingClassification' is REQUIRED EXACTLY ONCE for: 'auc:Building'

line 28: element Building: 
        [ERROR] element 'auc:OccupancyClassification' is REQUIRED EXACTLY ONCE for: 'auc:Building'

line 5: element Site: 
        [WARNING] elements 'auc:City' and 'auc:State' or element 'auc:ClimateZoneType//auc:ClimateZone' are RECOMMENDED EXACTLY ONCE at either the 'auc:Site' or 'auc:Building' level.

line 5: element Site: 
        [INFO] Number of 'auc:City' elements defined at the 'auc:Site' = 1, 'auc:Building' = 1
line 5: element Site: 
        [INFO] Number of 'auc:State' elements defined at the 'auc:Site' = 1, 'auc:Building' = 1
line 5: element Site: 
        [INFO] Number of 'auc:ClimateZoneType//auc:ClimateZone' elements defined at the 'auc:Site' = 0, 'auc:Building' = 0
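
Each message in this output starts with a bracketed severity tag, so the report could also be grouped or filtered by severity. A minimal sketch, assuming every message begins with a tag such as [ERROR], [WARNING], or [INFO] as it does here; `split_severity` is a hypothetical helper, and messages without a tag are labelled UNKNOWN.

```python
import re
from collections import defaultdict

# Leading whitespace, a bracketed upper-case tag, then the rest of the message.
SEVERITY_RE = re.compile(r'^\s*\[(?P<severity>[A-Z]+)\]\s*(?P<body>.*)$', re.DOTALL)


def split_severity(message):
    """Split '[ERROR] text' into ('ERROR', 'text'); untagged text is UNKNOWN."""
    match = SEVERITY_RE.match(message)
    if match is None:
        return 'UNKNOWN', message.strip()
    return match.group('severity'), match.group('body').strip()


# Two messages copied from the output, with the whitespace the report text carries.
messages = [
    "\n        [ERROR] element 'auc:BuildingClassification' is REQUIRED EXACTLY ONCE for: 'auc:Building'\n",
    "\n        [INFO] Number of 'auc:City' elements defined at the 'auc:Site' = 1, 'auc:Building' = 1",
]

by_severity = defaultdict(list)
for message in messages:
    severity, body = split_severity(message)
    by_severity[severity].append(body)

for severity in sorted(by_severity):
    print(severity, len(by_severity[severity]))
```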

nllong commented 4 years ago

@macintoshpie -- has this been resolved using lxml? If so, please close.