GeoNet / help

An issues repo for technical help questions.

Possible invalid QuakeML URIs? #4

Closed cjhopp closed 8 years ago

cjhopp commented 8 years ago

Hello,

Opening an issue for this here to facilitate communication (per Caroline's suggestion).

I’m a new PhD student at Vic working with Martha Savage and John Townend. I’ve just downloaded a few hundred QuakeML files from the GeoNet catalog, but they fail to validate against the QuakeML schema. The issue might be the multiple #s in the URIs (see the attached XML document), which, I believe, are not accepted within a URI?

A section (not all) of the output from xmllint validating the attached file against the QuakeML-1.2.xsd schema is as follows:

chet@chet-Dell-Precision-M3800:~/xml_validation$ xmllint --noout --valid --schema ~/xml_validation/QuakeML-1.2.xsd ~/xml_validation/2013p614135_orig.xml
/home/chet/xml_validation/2013p614135_orig.xml:2: validity error : Validation failed: no DTD found !
="http://quakeml.org/xmlns/quakeml/1.2" xmlns="http://quakeml.org/xmlns/bed/1.2"
                                                                              ^
/home/chet/xml_validation/2013p614135_orig.xml:25: element stationMagnitude: Schemas validity error : Element '{http://quakeml.org/xmlns/bed/1.2}stationMagnitude', attribute 'publicID': 'smi:scs/0.7/NLL.20130816054632.049483.46401#staMag.ML#NZ.AMCZ' is not a valid value of the atomic type '{http://quakeml.org/xmlns/bed/1.2}ResourceReference'.
/home/chet/xml_validation/2013p614135_orig.xml:57: element stationMagnitude: Schemas validity error : Element '{http://quakeml.org/xmlns/bed/1.2}stationMagnitude', attribute 'publicID': 'smi:scs/0.7/NLL.20130816054632.049483.46401#staMag.MLv#NZ.AMCZ' is not a valid value of the atomic type '{http://quakeml.org/xmlns/bed/1.2}ResourceReference'.
/home/chet/xml_validation/2013p614135_orig.xml:90: element stationMagnitude: Schemas validity error : Element '{http://quakeml.org/xmlns/bed/1.2}stationMagnitude', attribute 'publicID': 'smi:scs/0.7/NLL.20130816054632.049483.46401#staMag.ML#NZ.ANWZ' is not a valid value of the atomic type '{http://quakeml.org/xmlns/bed/1.2}ResourceReference'.

Removing all #s from the file lets it validate against QuakeML-1.2.xsd.
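
For anyone wanting to see which IDs are affected before editing files, here is a minimal sketch that lists publicID values containing more than one #. It relies on RFC 3986, where # only introduces the fragment and may not appear inside it, so a URI holds at most one; whether that is exactly what xmllint enforces here is an assumption. The function name and the second (origin) element in the sample are made up for illustration.

```python
import re

# Naive attribute scan; good enough for a quick look, not a full XML parse.
PUBLIC_ID_RE = re.compile(r'publicID="([^"]*)"')


def ids_with_multiple_hashes(xml_text):
    """Return publicID values that contain more than one '#'."""
    return [pid for pid in PUBLIC_ID_RE.findall(xml_text) if pid.count("#") > 1]


# First ID is taken from the xmllint output above; the second is hypothetical.
sample = (
    '<stationMagnitude publicID="smi:scs/0.7/NLL.20130816054632.049483.46401#staMag.ML#NZ.AMCZ"/>'
    '<origin publicID="smi:scs/0.7/NLL.20130816054632.049483.46401"/>'
)

print(ids_with_multiple_hashes(sample))
```

Run against a real file by passing its contents, for example `ids_with_multiple_hashes(open("2013p614135_orig.xml").read())`.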

A separate issue I’ve run into is the lack of ‘originID’ elements within the ‘stationMagnitude’ elements. Using the attached RelaxNG schema used by SeisHub (a database structure built on ObsPy), jing outputs the following errors:

chet@chet-Dell-Precision-M3800:~/xml_validation$ jing QuakeML-1.2-merged.rng 2013p614135_orig.xml
/home/chet/xml_validation/2013p614135_orig.xml:25:98: error: value of attribute "publicID" is invalid; must be a URI
/home/chet/xml_validation/2013p614135_orig.xml:37:26: error: element "stationMagnitude" incomplete; missing required element "originID"
/home/chet/xml_validation/2013p614135_orig.xml:57:99: error: value of attribute "publicID" is invalid; must be a URI
/home/chet/xml_validation/2013p614135_orig.xml:69:26: error: element "stationMagnitude" incomplete; missing required element "originID"
/home/chet/xml_validation/2013p614135_orig.xml:90:98: error: value of attribute "publicID" is invalid; must be a URI
/home/chet/xml_validation/2013p614135_orig.xml:102:26: error: element "stationMagnitude" incomplete; missing required element "originID"
/home/chet/xml_validation/2013p614135_orig.xml:122:99: error: value of attribute "publicID" is invalid; must be a URI
/home/chet/xml_validation/2013p614135_orig.xml:134:26: error: element "stationMagnitude" incomplete; missing required element "originID"
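
The "missing required element" errors can also be reproduced without jing. The sketch below uses Python's standard xml.etree.ElementTree to list stationMagnitude elements that have no originID child. The namespace is the one shown in the xmllint output; the function name and the tiny sample document (including its smi:example/... IDs) are hypothetical and only there to make the snippet runnable.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the validator output above.
BED = "http://quakeml.org/xmlns/bed/1.2"


def station_magnitudes_missing_origin_id(xml_text):
    """Return the publicID of every stationMagnitude lacking an originID child."""
    root = ET.fromstring(xml_text)
    missing = []
    for sta_mag in root.iter("{%s}stationMagnitude" % BED):
        if sta_mag.find("{%s}originID" % BED) is None:
            missing.append(sta_mag.get("publicID"))
    return missing


# Hypothetical, heavily trimmed document; real QuakeML has many more elements.
sample = """<eventParameters xmlns="http://quakeml.org/xmlns/bed/1.2">
  <stationMagnitude publicID="smi:example/a">
    <originID>smi:example/origin</originID>
  </stationMagnitude>
  <stationMagnitude publicID="smi:example/b"/>
</eventParameters>"""

print(station_magnitudes_missing_origin_id(sample))
```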

Does GeoNet validate these files against a different schema, and if so, which one? I’m hoping to load these event files into SeisHub as a database while processing data with ObsPy tools, but cannot do so if they don’t validate properly.

Any feedback as to how to troubleshoot this (what does GeoNet do with these files?) would be most welcome! Of course, if the error is mine, please do say so.

Cheers, Chet Hopp

Note: the .txt extensions on the following files are just to fool GitHub into allowing me to attach them:

2013p614135.xml.txt QuakeML-1.2.xsd.txt QuakeML-1.2-merged.rng.txt

gclitheroe commented 8 years ago

I've updated the XSL as per @kfenaugh's request; the originID element is there now.

gclitheroe commented 8 years ago

As for the validation error: there is a typo in the file QuakeML-BED-1.2.xsd (which is imported by QuakeML-1.2.xsd).

Find these lines in that file:

...
    <xs:restriction base="xs:anyURI">
      <xs:pattern value="(smi|quakeml):[\w\d][\w\d\-\.\*\(\)_~']{2,}/[\w\d\-\.\*\(\)\+\?_~'=,;#/&amp;]*"/>
    </xs:restriction>
...

Near the end of the pattern, it looks to me like /&amp; should be replaced with \&. When I make that change, the QuakeML validates.
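
To look at the pattern on its own, separately from the xs:anyURI base type and from any particular validator, here is a rough sketch using Python's re module. Two assumptions to keep in mind: the XML entity &amp; has been unescaped to a plain &, and Python's regex dialect is not identical to the XSD one, so this only shows what Python's engine does with the expression, not how xmllint or jing treat it. The helper name is made up.

```python
import re

# Pattern copied from the xs:pattern above, with &amp; unescaped to &.
# Assumption: Python's re is close enough to the XSD dialect for this ASCII input.
RESOURCE_REFERENCE = re.compile(
    r"(smi|quakeml):[\w\d][\w\d\-\.\*\(\)_~']{2,}/"
    r"[\w\d\-\.\*\(\)\+\?_~'=,;#/&]*"
)


def matches_pattern(value):
    """True if the whole string matches the ResourceReference pattern."""
    return RESOURCE_REFERENCE.fullmatch(value) is not None


# publicID taken from the xmllint output earlier in this thread.
public_id = "smi:scs/0.7/NLL.20130816054632.049483.46401#staMag.ML#NZ.AMCZ"
print(matches_pattern(public_id))
```

Comparing this result with what the schema validator reports may help narrow down whether the rejection comes from the pattern itself or from somewhere else in the type definition.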

@kfenaugh will let the QuakeML folk know.

Hope this helps.

kfenaugh commented 8 years ago

Email with this recommendation sent through to the folk at quakeml@sed.ethz.ch.

cjhopp commented 8 years ago

This helps immensely! @kfenaugh, thanks for forwarding me the email to the ETH people. Please let me know what comes of it.

kfenaugh commented 8 years ago

Reply from ETH over the weekend:

"Hello Kevin,

I have to look into this, but I'm pretty sure already that you are right. Many thanks for letting us know!

Best regards, Fabian"