metanorma / metanorma-ieee

Metanorma for IEEE SA
BSD 2-Clause "Simplified" License

Misrender of `--` em-dash (long dash) into `—-` (long dash followed by short dash) in semantic XML #373

Closed ReesePlews closed 1 month ago

ReesePlews commented 1 month ago

we have some text in our document which is encoded as:

...arbitrarily complex relationships -- including hierarchical, heterarchical, and nested -- is central to its role...

the PDF is rendered as follows (notice the long dash followed immediately by a short dash, possibly a hyphen):

image

the MS-Word document is rendered as:

image

from reading the metanorma documentation, i thought the em-dash (long dash) was encoded as two dashes (`--`):

image

@opoudjis could you tell me if i am misinterpreting the method for creating the em-dash, or if there is an issue with the tool? thank you.

ronaldtse commented 1 month ago

I assume this is the PDF? Ping @Intelligent2013

ReesePlews commented 1 month ago

thank you @ronaldtse and @opoudjis, i should have been more clear. the first image was the PDF, i have updated the original issue to now include output from MS-Word. i am not using the HTML output so i am not checking that, sorry.

ronaldtse commented 1 month ago

@ReesePlews is it possible to provide the relevant XML output? That would quickly resolve any question of reproducibility and quicken the fix. Thanks!

ronaldtse commented 1 month ago

The semantic XML is:

<p id="_f96eb3d4-1759-7e07-09a5-9aaa3e73f7c0">... navigate arbitrarily complex relationships —- including hierarchical, heterarchical, and nested —- is central to its role...</p>

Notice that the sequence of —- is already present. This is a bug.

ronaldtse commented 1 month ago

Actually this is not a bug. The original encoding is:

... navigate arbitrarily complex relationships --- including hierarchical, heterarchical, and nested --- is central to its role...

There are 3 dashes here. The first 2 dashes became an em-dash and the last one remains.
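
The effect described above can be reproduced with a minimal sketch. It assumes a naive left-to-right substitution of `--` with an em-dash; the function name `replace_double_hyphen` is hypothetical, and this is an illustration of the observed behavior, not the actual replacement rule used by Metanorma or Asciidoctor.

```python
# Illustrative model only: a naive, non-overlapping, left-to-right
# substitution of "--" with an em-dash (U+2014). The real processor's
# rule may differ; this just shows why "---" leaves a stray hyphen.
EM_DASH = "\u2014"


def replace_double_hyphen(text: str) -> str:
    """Replace every non-overlapping "--" with an em-dash."""
    return text.replace("--", EM_DASH)


# Two hyphens: the whole sequence is consumed.
two = replace_double_hyphen("relationships -- including nested -- is central")

# Three hyphens: the first two are consumed, the third is left behind.
three = replace_double_hyphen("relationships --- including nested --- is central")

print(two)
print(three)
```

Under this model, the three-hyphen input yields an em-dash followed by a hyphen, which matches the sequence seen in the semantic XML.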

I checked the "fixed branch" and the encoding is correct:

https://github.com/Spatial-Web-Foundation/SWF-Corpus_and_IEEEP2874-D2/pull/1590

<p id="_a03d7f45-491a-79c8-1bdd-f0c99444863f">... navigate arbitrarily complex relationships — including hierarchical, heterarchical, and nested — is central to its role...</p>

ronaldtse commented 1 month ago

Closing: the problem was due to incorrect syntax in the source (`---` instead of `--`).
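
For anyone hitting the same symptom, a small check over the source text can flag runs of three or more hyphens inside prose before compiling. This is a sketch under stated assumptions: `find_hyphen_runs` is a hypothetical helper, not part of Metanorma, and it skips lines made up only of hyphens on the assumption that those are block delimiters or rules rather than prose.

```python
import re

# Three or more consecutive hyphens anywhere in a line.
HYPHEN_RUN = re.compile(r"-{3,}")


def find_hyphen_runs(text: str) -> list[tuple[int, int, str]]:
    """Return (line number, column, matched run) for each suspicious run.

    Line and column numbers are 1-based. Lines consisting only of
    hyphens are skipped (assumed to be delimiters, not prose).
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped and set(stripped) == {"-"}:
            continue
        for match in HYPHEN_RUN.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits


sample = "relationships --- including nested --- is central\n----\nfine -- here"
for line_no, col, run in find_hyphen_runs(sample):
    print(f"line {line_no}, column {col}: found {run!r}")
```

A check like this would have pointed at the `---` sequences in the original paragraph while leaving the correct `--` usage alone.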