metanorma / metanorma-ogc

Metanorma processor for OGC documents
https://www.metanorma.com
BSD 2-Clause "Simplified" License

URL in query string breaks PDF compilation #664

Closed opoudjis closed 5 months ago

opoudjis commented 6 months ago

Arises in compiling document in https://github.com/opengeospatial/ogcapi-maps/pull/132

as reported in https://github.com/opengeospatial/ogcapi-maps/issues/129

dealing with URLs like

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D["https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326\]"]

with & in the URL string

This was breaking HTML, and I've fixed that in https://github.com/metanorma/metanorma-ogc/issues/650

I think the issue is that the URL content is being rendered as content of the hyperlink if the <link target="..."/> element is empty, and the ampersand is not being XML-escaped...
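As an illustration of the escaping this hypothesis refers to (a minimal sketch in Python, not Metanorma's own Ruby code), a URL that is copied into element content needs its ampersands turned into `&amp;` before it is valid XML. The variable names here are assumptions for the example only.

```python
from xml.sax.saxutils import escape

# URL from the report, with raw ampersands in the query string.
url = (
    "https://maps.gnosis.earth/ogcapi/collections/blueMarble/map"
    "?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D"
)

# escape() replaces &, < and > with their XML entities, so the
# result is safe to place inside a <link>...</link> element.
link_text = escape(url)
print(link_text)
```

If the raw `url` were written into the XML instead of `link_text`, an XML parser would reject the unescaped `&`.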

Intelligent2013 commented 6 months ago

I can't reproduce this issue. The PDF is generated successfully: image (Note: A.7 instead of B.7, because I use only annex_examples.adoc in the test generation.)

I use this https://github.com/opengeospatial/ogcapi-maps/pull/132/commits/12a0c59d90a99e28c761067c006939203346f8e5

git clone https://github.com/opengeospatial/ogcapi-maps
cd ogcapi-maps
git checkout fix_issue_129
cd core\standard 

I've just generated the presentation XML for these parts:

include::clause_0_front_material.adoc[]

include::clause_1_scope.adoc[]

include::annex_examples.adoc[]

Intelligent2013 commented 6 months ago

What concrete error do you get? Please attach the error from the .err file, which is written next to where the resulting PDF should be generated.

Intelligent2013 commented 6 months ago

I've uncommented the line in annex_examples.adoc:

// https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D["https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326\]"]

and the PDF is still generated.

But in the Presentation XML there is an empty <link/> nested inside the link:

<p id="_3e41110b-6561-556a-9bae-a96a0a61cde4"><link target="https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&amp;scale-denominator=20000000&amp;crs=%5BEPSG:4326%5D"><link/></link>”</p>

opoudjis commented 6 months ago

Attention @jerstlouis

But the <link/> in the Presentation XML is clearly a bug on my side.
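For anyone wanting to check their own Presentation XML for this symptom, here is a rough sketch in Python. It assumes the paragraph has been copied out without its XML namespace (real Presentation XML is namespaced, so the tag name would need adjusting), and `nested_links` is a hypothetical helper, not part of Metanorma.

```python
import xml.etree.ElementTree as ET

# The paragraph reported above, with the namespace omitted for brevity.
snippet = (
    '<p id="_3e41110b-6561-556a-9bae-a96a0a61cde4">'
    '<link target="https://maps.gnosis.earth/ogcapi/collections/blueMarble/map'
    '?center=0,51.5&amp;scale-denominator=20000000&amp;crs=%5BEPSG:4326%5D">'
    '<link/></link>\u201d</p>'
)


def nested_links(xml_text):
    """Return every <link> element that contains another <link>."""
    root = ET.fromstring(xml_text)
    # find(".//link") searches only the descendants of each element,
    # so a <link> is reported only when another <link> sits inside it.
    return [el for el in root.iter("link") if el.find(".//link") is not None]


bad = nested_links(snippet)
print(len(bad), "nested link(s) found")
```

An empty result means no link in the fragment wraps another link.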

opoudjis commented 6 months ago

Asciidoc is not that smart, @jerstlouis, and your current HTTP link is killing it.

Asciidoc processes http links by Regex. I know that is dumb and error-prone, but that is how Asciidoc is implemented.

So it can cope with

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D

and it can cope with

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326]

But when you give it:

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D["https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326\]"]

it is using a Regex to resolve not only the https:// at the start, but also the https:// inside the https:....[] "macro", and the result ends up being complete confusion. That confusion ends up turning into a nested <link/> when Metanorma receives the results of Asciidoctor parsing. And the PDF processor, justly, cannot cope with that.

In order to prevent that confusion, you need to escape the internal https: instance:

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D["\https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326\]"]

That renders sensibly in Presentation XML as:

<p id="_3ba27ef8-64b8-2244-828b-f8fd7970c523"><link target="https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&amp;scale-denominator=20000000&amp;crs=%5BEPSG:4326%5D">https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&amp;scale-denominator=20000000&amp;crs=[EPSG:4326]</link></p>
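The effect of the backslash can be illustrated with a deliberately simplified stand-in for regex-based link detection. This is a Python sketch, and the pattern below is an assumption made for the example, not Asciidoctor's actual regex: it only counts places where a URL scheme starts without a preceding backslash.

```python
import re

# Simplified stand-in: a URL scheme that is NOT preceded by a backslash.
AUTOLINK_START = re.compile(r"(?<!\\)https?://")

target = (
    "https://maps.gnosis.earth/ogcapi/collections/blueMarble/map"
    "?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D"
)
label = (
    "https://maps.gnosis.earth/ogcapi/collections/blueMarble/map"
    "?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326\\]"
)

# Original form: the label starts with a bare https://
unescaped = target + '["' + label + '"]'
# Fixed form: the label starts with \https://
escaped = target + '["' + "\\" + label + '"]'


def count_link_starts(text):
    """Count the positions where the simplified pattern would start a link."""
    return len(AUTOLINK_START.findall(text))


print(count_link_starts(unescaped))
print(count_link_starts(escaped))
```

In the unescaped form the pattern finds two link starts inside one macro, which matches the nested link described above. With the backslash in place, only the outer one remains.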

@Intelligent2013 @jerstlouis Does this address the issue?

Intelligent2013 commented 6 months ago

@opoudjis I've checked:

<p id="_3ba27ef8-64b8-2244-828b-f8fd7970c523"><link target="https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&amp;scale-denominator=20000000&amp;crs=%5BEPSG:4326%5D">https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&amp;scale-denominator=20000000&amp;crs=[EPSG:4326]</link></p>

renders in PDF correctly: image

opoudjis commented 6 months ago

OK, closing.

opoudjis commented 5 months ago

alert editors

jerstlouis commented 5 months ago

So I've verified this with the 3 links including CRS CURIEs that were causing problems, and this is what works for each of them:

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:4326%5D["\https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326\]"]

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=%5BEPSG:3395%5D["\https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?center=0,51.5&scale-denominator=20000000&crs=[EPSG:3395\]"]

https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?bbox-crs=%5BEPSG:3395%5D&bbox=-4596385.263861,2080129.089271,4596385.263861,11273386.415933&crs=%5BEPSG:3395%5D["\https://maps.gnosis.earth/ogcapi/collections/blueMarble/map?bbox-crs=[EPSG:3395\]&bbox=-4596385.263861,2080129.089271,4596385.263861,11273386.415933&crs=[EPSG:3395\]"]

For the last one, the text is cut off in the PDF rendering because it doesn't get a line break, which is somewhat problematic:

cutOffLink

Note that these escaping backslashes render as actual backslashes in the displayed text when using asciidoctor-pdf, so this syntax and behavior differs between Metanorma and plain AsciiDoc. Is that expected?
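The pattern shared by the three working links above can be sketched as a small Python helper. `curie_link` is a hypothetical name used only for this example. The output follows the form verified with Metanorma in this thread; as noted, plain asciidoctor-pdf displays the backslashes literally, so this should not be read as portable AsciiDoc.

```python
def curie_link(url):
    """Build a URL macro from a URL whose query string contains [CURIE] values.

    Target: square brackets percent-encoded as %5B / %5D.
    Label:  the readable URL, prefixed with a backslash so the inner
            https:// is not picked up as a second link, and with each
            closing bracket escaped so it does not end the macro early.
    """
    target = url.replace("[", "%5B").replace("]", "%5D")
    label = "\\" + url.replace("]", "\\]")
    return target + '["' + label + '"]'


readable = (
    "https://maps.gnosis.earth/ogcapi/collections/blueMarble/map"
    "?center=0,51.5&scale-denominator=20000000&crs=[EPSG:4326]"
)
macro = curie_link(readable)
print(macro)
```

Starting from the readable URL keeps a single source of truth, so the target and the displayed text cannot drift apart when a link is edited.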

To summarize just how incredibly difficult it is to encode such links including CURIEs: