Open jescriu opened 3 years ago
Thanks for reporting this issue, @jescriu .
I've tried to run the transformation via the GeoDCAT-AP API demo, which uses the PHP implementation, and it works.
It seems that the problem with the proposed Python script is that the etree.parse function does not support HTTPS.
A possible fix:
import lxml.etree as ET
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2
# The URL of the XML document to be transformed. Here it corresponds to a "GetRecords" output of a fictitious CSW, with the "maxRecords" parameter set to 10.
xmlURL = "http://some.site/csw?request=GetRecords&service=CSW&version=2.0.2&namespace=xmlns%28csw=http://www.opengis.net/cat/csw%29&resultType=results&outputSchema=http://www.isotc211.org/2005/gmd&outputFormat=application/xml&typeNames=csw:Record&elementSetName=full&constraintLanguage=CQL_TEXT&constraint_language_version=1.1.0&maxRecords=10"
# The URL pointing to the latest version of the XSLT.
xslURL = "https://raw.githubusercontent.com/SEMICeu/iso-19139-to-dcat-ap/master/iso-19139-to-dcat-ap.xsl"
xml = ET.parse(urlopen(xmlURL))
xsl = ET.parse(urlopen(xslURL))
transform = ET.XSLT(xsl)
print(ET.tostring(transform(xml), pretty_print=True))
Does this work?
I wrote a Python script that uses urllib.request from the Python standard library to solve this issue. The script accepts URLs and file paths as arguments.
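For readers who want to see the idea without opening the linked script, here is a minimal sketch of that approach. It is not the contributed script itself: the function names `is_url`, `read_source` and `transform` are hypothetical, and it assumes Python 3 with lxml installed for the transformation step.

```python
import io
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(source):
    """Return True if `source` looks like an http(s) URL rather than a file path."""
    return urlparse(source).scheme in ("http", "https")


def read_source(source):
    """Return the raw bytes of a URL (fetched with urllib.request) or of a local file."""
    if is_url(source):
        # urlopen handles HTTPS, so the download no longer depends on etree.parse.
        with urlopen(source) as response:
            return response.read()
    with open(source, "rb") as handle:
        return handle.read()


def transform(xml_source, xsl_source):
    """Apply the XSLT at `xsl_source` to the XML at `xml_source`; return the result as bytes."""
    # Imported here so the two helpers above stay usable without lxml installed.
    import lxml.etree as ET

    xml = ET.parse(io.BytesIO(read_source(xml_source)))
    xsl = ET.parse(io.BytesIO(read_source(xsl_source)))
    return ET.tostring(ET.XSLT(xsl)(xml), pretty_print=True)
```

Under these assumptions, `transform("metadata.xml", xslURL)` and `transform(xmlURL, "iso-19139-to-dcat-ap.xsl")` would both work, since each argument is checked independently for being a URL or a path.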
Thanks for contributing the script, @arbakker . I included a link to it in the "How To" page (see commit https://github.com/SEMICeu/iso-19139-to-dcat-ap/commit/41026c285e688f551e92f3b6063a4c7d3bce8997).
@jescriu , I updated the Python script as illustrated in https://github.com/SEMICeu/iso-19139-to-dcat-ap/issues/29#issuecomment-838545291 (see https://github.com/SEMICeu/iso-19139-to-dcat-ap/commit/41026c285e688f551e92f3b6063a4c7d3bce8997).
Does this fix address your issue?
About your other question:
The XSLT should always return a correct RDF file, irrespective of the tool you're using.
Other options to test it are the GeoDCAT-AP API I mentioned earlier in this thread, or the command line tool above kindly contributed by @arbakker .
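If you want a quick local sanity check of a file produced by any of these tools, something like the sketch below may help. It only verifies that the file is well-formed XML with an rdf:RDF root element, which assumes the XSLT output is serialised as RDF/XML; it does not validate the content against DCAT-AP. The function name `looks_like_rdf_xml` is hypothetical.

```python
import xml.etree.ElementTree as ElementTree

# Namespace URI of the RDF vocabulary, used to recognise the rdf:RDF root element.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def looks_like_rdf_xml(data):
    """Return True if `data` is well-formed XML whose root element is rdf:RDF."""
    try:
        root = ElementTree.fromstring(data)
    except ElementTree.ParseError:
        # Not well-formed XML at all.
        return False
    return root.tag == "{%s}RDF" % RDF_NS
```

For example, reading the Notepad++ output with `open("output.rdf", "rb").read()` (a hypothetical file name) and passing the bytes to `looks_like_rdf_xml` would catch a truncated or non-RDF result, though a `True` result is only a first, shallow check.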
Dear colleagues,
I tried to run iso-19139-to-dcat-ap XSLT via Python (as proposed in this GitHub page), but I got an error message when parsing the input .xml metadata file from a GetRecordById request to my catalogue (https://www.ide.cat/servei/catalunya/cataleg-idec/csw?request=GetRecordById&service=CSW&version=2.0.2&outputSchema=http://www.isotc211.org/2005/gmd&ElementSetName=full&ID=inspire-adreces).
After experiencing this issue, I ran the XSLT transformation directly on my ISO .xml metadata files in Notepad++, using the XML Tools plugin (Plugins > XML Tools > XSL Transformation). However, I am not sure whether this is an appropriate way to run it, or whether the .rdf DCAT metadata files obtained this way are correct.
Anyway, I think running the XSLT script in Notepad++ could be a good way to promote its use among non-developer users.
Happy to get your feedback.
All the best,
Jordi