oasis-open / dita-rng-converter

OASIS TC Open Repository: The DITA RNG Converter provides cross-platform tools for generating DITA-conforming DTD- and XSD-format versions of RELAX NG DITA grammars: document type shells, vocabulary modules, and constraint modules. It makes it as easy as possible to develop and maintain DITA grammars by allowing use of RELAX NG syntax.
https://github.com/oasis-open/dita-rng-converter
Apache License 2.0

[java] Resolved URL is malformed: unknown protocol: urn #2

Closed ToshihikoMakita closed 6 years ago

ToshihikoMakita commented 8 years ago

I tested using new specialization files that can be downloaded from the following URLs:

https://github.com/AntennaHouse/ah-dita
https://github.com/AntennaHouse/ah-dita/tree/master/com.antennahouse.dita.dita13.doctypes

     [java]  + [INFO] Generating .mod and .ent files in directory "D:\My_Documents\Proj\RNGToDTD2\dita-rng-converter\test_out\1.3\dtd"...
     [java]  + [INFO] processModules: Handling module urn:x-antennahouse:dita:rng:characterDomain.rng...
     [java]  + [DEBUG] generate-modules: rngModuleUrl="urn:x-antennahouse:dita:rng:characterDomain.rng"
     [java]  + [DEBUG] generate-modules: resultDir="file:/D:/My_Documents/Proj/RNGToDTD2/dita-rng-converter/test_out/1.3/dtd/dtd"
     [java]  + [DEBUG] processModules: Applying templates in mode entityFile to generate "urn:x-antennahouse:dita:rng:characterDomain.ent"
     [java] Error at xsl:result-document on line 466 of rng2ditadtd.xsl:
     [java]   Resolved URL is malformed: unknown protocol: urn
     [java]   at xsl:apply-templates (file:/D:/My_Documents/Proj/RNGToDTD2/dita-rng-converter/xsl/rng2ditadtd/rng2ditadtd.xsl#296)
     [java]      processing /
     [java] Resolved URL is malformed

BUILD FAILED
D:\My_Documents\Proj\RNGToDTD2\dita-rng-converter\build.xml:220: Java returned: 2

I attached the whole log file.

log_20160417.zip

drmacro commented 8 years ago

The message "unknown protocol: urn" always means that the URN couldn't be resolved, which causes the URI resolver to fall back to the default resolver, and that resolver does not know about the "urn:" protocol.

This can really only happen because a URI in the source RNG is not mapped in the catalog provided to Saxon. So the first thing to check is the catalog. If you are using Oxygen for development and using the same catalog with Oxygen you should get the same resolution failure there.
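
For illustration, a catalog entry that maps such a URN to a local file might look like the sketch below. The URN is the one from the log above; the relative path in @uri is a placeholder, not the actual ah-dita layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Maps the URN used in an RNG reference to a file on disk.
       The @uri value is hypothetical and is resolved relative to this catalog. -->
  <uri name="urn:x-antennahouse:dita:rng:characterDomain.rng"
       uri="rng/characterDomain.rng"/>
</catalog>
```

If no such entry is reachable from the catalog handed to Saxon, the URN is passed through unresolved.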

ToshihikoMakita commented 8 years ago

Please look at where this error occurs in the stylesheet.

[java] Error at xsl:result-document on line 466 of rng2ditadtd.xsl:

The error occurs at xsl:result-document, so the catalog file is not involved. Here is my debugging result.

[rng2ditadtd.xsl line 466]
        <xsl:message select="concat('xsl:rsult-document/@href=''',$entResultUrl,'''')"/>
        <xsl:result-document href="{$entResultUrl}" format="dtd">
          ...
        </xsl:result-document>

The output result is as follows:

 [java] xsl:rsult-document/@href='urn:x-antennahouse:dita:rng:characterDomain.ent'

In my simple testing, xsl:result-document/@href='urn:x-antennahouse:dita:rng:characterDomain.ent' always fails with the error message "Resolved URL is malformed: unknown protocol: urn", whereas xsl:result-document/@href="characterDomain.ent" always succeeds. So isn't this clearly a bug in the stylesheet?

drmacro commented 8 years ago

That's very different then. Obviously the transform cannot write to that URN.

So the question is why it's getting that URN as the result URL to write to.

I'll try to look into this as soon as I can.

ToshihikoMakita commented 8 years ago

document-uri(root(.)) of each RNG module seems to return the @name value from the catalog file. document-uri() does not return the actual file URI that the catalog file defines in @uri.

drmacro commented 8 years ago

I believe that's because Saxon uses what was specified in the initial document() function as the URI of the resolved document, not whatever the catalog resolver parsed it to.

If so, it means you may need to have an RNG shell that uses direct URI references as the basis for generating the DTDs and XSDs. That's certainly the case for all the TC-defined shells which is probably why I never noticed this issue before.

ToshihikoMakita commented 8 years ago

> If so, it means you may need to have an RNG shell that uses direct URI references as the basis for generating the DTDs and XSDs. That's certainly the case for all the TC-defined shells which is probably why I never noticed this issue before.

Now I finally understand why the OASIS RNG shell files use a path in include/@href instead of URN notation. But in this case we can resolve the issue as follows:

  1. Read [DITA-OT]/catalog-dita.xml directly from the stylesheet and build a temporary tree like the following: ` ... ... `

  2. Write an xsl:function that returns the file URI for a given URN string. This avoids the "unknown protocol: urn" error.

If this idea can be useful, I will implement it as soon as possible.

ToshihikoMakita commented 8 years ago

> If this idea can be useful, I will implement it as soon as possible.

The module proposed is as follows.

[catalog_util.xsl]

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
    xmlns:cat="urn:oasis:names:tc:entity:xmlns:xml:catalog"
    xmlns:catu="urn:catalog-utility"
    exclude-result-prefixes="xs xd cat catu"
    version="2.0">

    <xd:doc scope="stylesheet">
        <xd:desc>
            <xd:p><xd:b>Created on:</xd:b> May 5, 2016</xd:p>
            <xd:p><xd:b>Author:</xd:b> toshi</xd:p>
            <xd:p>Catalog file utility</xd:p>
        </xd:desc>
    </xd:doc>

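    <xd:doc>
        <xd:desc>DITA-OT catalog file tree. Built from the global $catalogUrl, which (like $debug below) is not declared in this module and must be supplied by the importing stylesheet.</xd:desc>
    </xd:doc>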
    <xsl:variable name="catalogTree" as="document-node()">
        <xsl:document>
            <xsl:call-template name="expandCatalogFile"/>
        </xsl:document>
    </xsl:variable>

    <xd:doc>
        <xd:desc>Expand catalog file into temporary tree</xd:desc>
    </xd:doc>
    <xsl:template name="expandCatalogFile">
        <catalog>
            <xsl:apply-templates select="document($catalogUrl)" mode="MODE_EXPAND_CATALOG">
                <xsl:with-param name="prmCatalogDir" tunnel="yes" select="$catalogUrl"/>
            </xsl:apply-templates>
        </catalog>
    </xsl:template>

    <xsl:template match="/" mode="MODE_EXPAND_CATALOG">
        <xsl:apply-templates select="*" mode="#current"/>
    </xsl:template>

    <xsl:template match="cat:*" mode="MODE_EXPAND_CATALOG"/>

    <xsl:template match="cat:catalog" mode="MODE_EXPAND_CATALOG">
        <xsl:apply-templates select="*" mode="#current"/>
    </xsl:template>

    <xsl:template match="cat:nextCatalog" mode="MODE_EXPAND_CATALOG">
        <xsl:param name="prmCatalogDir" tunnel="yes" as="xs:string"/>
        <xsl:variable name="nextCatalog" as="xs:string" select="string(@catalog)"/>
        <xsl:variable name="fullNextCatalogDir" as="xs:string" select="string(resolve-uri($nextCatalog,$prmCatalogDir))"/>
        <xsl:apply-templates select="document($nextCatalog,.)" mode="#current">
            <xsl:with-param name="prmCatalogDir" tunnel="yes" select="$fullNextCatalogDir"/>
        </xsl:apply-templates>
    </xsl:template>

    <xsl:template match="cat:group" mode="MODE_EXPAND_CATALOG">
        <xsl:apply-templates select="*" mode="#current"/>
    </xsl:template>

    <xsl:template match="cat:uri" mode="MODE_EXPAND_CATALOG">
        <xsl:param name="prmCatalogDir" tunnel="yes" as="xs:string"/>
        <xsl:variable name="name" select="string(@name)"/>
        <xsl:variable name="uri" select="string(@uri)"/>
        <xsl:variable name="fileUri" select="string(resolve-uri($uri,$prmCatalogDir))"/>
        <uri id="{$name}" uri="{$fileUri}"/>
    </xsl:template>

    <xd:doc>
        <xd:desc>Get file URI from URN using catalog file tree</xd:desc>
        <xd:param>$prmUrn: URN</xd:param>
        <xd:return>File URI</xd:return>
    </xd:doc>
    <xsl:function name="catu:getFileUriFromUrn" as="xs:string">
        <xsl:param name="prmUrn" as="xs:string"/>
        <xsl:variable name="fileUri" as="xs:string" select="string(($catalogTree/catalog/uri[string(@id) eq $prmUrn]/@uri)[1])"/>
        <xsl:choose>
            <xsl:when test="$fileUri ne ''">
                <xsl:if test="$debug">
                    <xsl:message> + [DEBUG] URN="<xsl:value-of select="$prmUrn"/>" is mapped to "<xsl:value-of select="$fileUri"/>"</xsl:message>
                </xsl:if>
                <xsl:sequence select="$fileUri"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:message> + [ERROR] URN="<xsl:value-of select="$prmUrn"/>" is not defined in catalog file.</xsl:message>
                <xsl:sequence select="$prmUrn"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:function>
</xsl:stylesheet>
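
A minimal usage sketch of the module above, assuming it is saved as catalog_util.xsl next to the calling stylesheet. The template name show-mapping and the default catalog path are placeholders for illustration only; $catalogUrl and $debug are declared here because the module references them without declaring them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:catu="urn:catalog-utility"
    exclude-result-prefixes="xs catu">

    <xsl:import href="catalog_util.xsl"/>

    <!-- Globals referenced by catalog_util.xsl. The catalog path is a placeholder
         and must be an absolute URL so that resolve-uri() can use it as a base. -->
    <xsl:param name="catalogUrl" as="xs:string" select="'file:/path/to/catalog-dita.xml'"/>
    <xsl:param name="debug" as="xs:boolean" select="false()"/>

    <!-- Hypothetical entry point: reports the file URI mapped to one URN. -->
    <xsl:template name="show-mapping">
        <xsl:message select="catu:getFileUriFromUrn('urn:x-antennahouse:dita:rng:characterDomain.rng')"/>
    </xsl:template>
</xsl:stylesheet>
```

Note that when the URN is not found, the function returns the URN unchanged, so a caller that passes the result to xsl:result-document would still hit the original error.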

drmacro commented 8 years ago

I'm taking a stab at implementing this suggested fix.

In the future, better to submit this type of request as a pull request.

ToshihikoMakita commented 8 years ago

> In the future, better to submit this type of request as a pull request.

It is already included in the following pull request.

https://github.com/oasis-open/dita-rng-converter/pull/3/files#diff-55da3f0c14d2dba6846c503929f33adb

nakohdo commented 6 years ago

Hi Eliot,

We've tested the converter with our custom parson DITA and faced the same issue as Toshihiko. His pull request fixed things. However, I've seen that you haven't merged the pull request yet due to a merge conflict. How can we help to get this merged to the common code base?

Best regards, Frank

drmacro commented 6 years ago

I should be able to look at this in a couple of days--still playing catch up from being in Germany for two weeks. If you were able to identify the solution to the merge conflict that would help but I'm not expecting it.

nakohdo commented 6 years ago

Hi Eliot,

I've managed to get the conversion script running but needed to make some modifications:

  1. Apply @ToshihikoMakita's pull request.
  2. Change some path settings in the build file as reported in issue #11.

I suppose Toshihiko's build file was forked from an older version or from another branch, which might have caused the merge conflict, but I haven't figured out the details yet. Issue #10 might be related.

drmacro commented 6 years ago

I'm working on integrating/implementing catalog-based URI resolution using Makita-san's code as a base.

drmacro commented 6 years ago

Turns out base-uri() returns the system URL for documents opened via document(), while document-uri() returns the URI passed into document(), so there should be no need for catalog processing.
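
The difference can be checked with a small sketch like the one below. The parameter and template names are made up for the example, and it assumes the transform runs with a catalog resolver that maps the URN to a file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: a URN that the catalog maps to a file. -->
  <xsl:param name="moduleUrn"
      select="'urn:x-antennahouse:dita:rng:characterDomain.rng'"/>

  <!-- Hypothetical named template, meant to be run as the initial template. -->
  <xsl:template name="report-uris">
    <xsl:variable name="module" select="document($moduleUrn)"/>
    <!-- Expected, per the observation above: the URN as passed to document(). -->
    <xsl:message>document-uri(): <xsl:value-of select="document-uri($module)"/></xsl:message>
    <!-- Expected, per the observation above: the resolved system URL. -->
    <xsl:message>base-uri(): <xsl:value-of select="base-uri($module)"/></xsl:message>
  </xsl:template>
</xsl:stylesheet>
```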

For reference, Gerrit Imsieke has also implemented an XSLT catalog resolver, although it is not a complete implementation:

https://github.com/transpect/xslt-util/tree/master/xslt-based-catalog-resolver

drmacro commented 6 years ago

Handling of module system IDs fixed by replacing use of document-uri() with base-uri() and adding @xml:base to intermediate modules so base-uri() always returns a good value.
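
As a rough sketch of the @xml:base part (not the converter's actual code; the mode name is invented), a template that builds an intermediate module could record the source location explicitly:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name; copies a module's root element into an
       intermediate tree. -->
  <xsl:template match="/*" mode="make-intermediate">
    <xsl:copy>
      <!-- Record the module's resolved location explicitly, so base-uri()
           does not depend on how the intermediate tree was constructed. -->
      <xsl:attribute name="xml:base" select="base-uri(.)"/>
      <!-- Copy the remaining attributes first, then all child nodes. -->
      <xsl:copy-of select="@* except @xml:base, node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Result file names can then be derived from base-uri() of the intermediate module rather than from document-uri().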

With this fix, the ah-dita DTDs generate correctly.

Updated the code to only output the entity filename in external parameter entity declarations when "use public IDs in shell" is true.

Updated code is pushed to develop branch.