Closed ronaldtse closed 6 years ago
Once I fixed the "[en] language missing" error for the T&D sections, most of these problems went away. Do we need to ensure that these parts don't go missing even when the T&D content is broken?
Remaining issues:
* 为 ``真``表示"``验证通过``",为 ``假`` 表示"``验证不通过``"。
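is rendered as 为 真 表示"#x9A8C;#x8BC1;#x901A;#x8FC7; ",为 假 表示"#x9A8C;#x8BC1;#x4E0D;#x901A;#x8FC7; "。 (The Chinese reads roughly: "true means 'verification passed', false means 'verification not passed'".)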
So the document was breaking when the English language was missing in T&D; that's surprising.
* (本稿完成日期:2018年1月) ("completion date of this draft: January 2018"): boilerplate found in the original document that I'm now stripping out. I will put it back in if it can be anchored to the created date in bibdata (or, more to the point, the last-updated date).
Error in your markup, and it's not being caught in validation properly (because RNC `text` can be empty): what you have marked up as the title-intro is the title-main. The title-main is mandatory; the title-intro is optional. A one-phrase title is supposed to use title-main, not title-intro.
https://github.com/riboseinc/asciidoctor-iso/issues/106 to ensure empty strings such as title-main are not generated.
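The check described above (title-main mandatory and non-empty, title-intro optional) could be sketched roughly as follows. This is only an illustration in Python, not the gem's actual validation: the helper name `title_problems` and the `<title>` wrapper element are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def title_problems(xml_text):
    """Return a list of complaints about the title parts of a fragment.

    Hypothetical helper: the nesting of title-intro/title-main under a
    <title> wrapper is assumed here purely for illustration.
    """
    root = ET.fromstring(xml_text)
    problems = []

    main = root.find(".//title-main")
    if main is None:
        problems.append("title-main is missing")
    elif not (main.text or "").strip():
        # An empty element passes a grammar that allows empty text,
        # so it has to be checked separately.
        problems.append("title-main is empty")

    intro = root.find(".//title-intro")
    if intro is not None and not (intro.text or "").strip():
        problems.append("title-intro is present but empty")

    return problems


bad = "<title><title-intro>SM2密码算法使用规范</title-intro><title-main></title-main></title>"
good = "<title><title-main>SM2密码算法使用规范</title-main></title>"
print(title_problems(bad))
print(title_problems(good))
```

The point of the sketch is that an "is the element there" check is not enough when the grammar lets the element be empty; the text content has to be tested as well.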
the line * 为 ``真``表示"``验证通过``",为 ``假`` 表示"``验证不通过``"。 is rendered as 为 真 表示"#x9A8C;#x8BC1;#x901A;#x8FC7; ",为 假 表示"#x9A8C;#x8BC1;#x4E0D;#x901A;#x8FC7; "。
There are backticks (`) that have made it into the wild in your document unescaped, and Html2Word is assuming them to be AsciiMath delimiters; it's therefore attempting to render the text as OOXML maths, and getting it wrong. https://github.com/riboseinc/isodoc/issues/31 to generate correct delimiters.
6.2: problem character is the Unicode ellipsis.
6.3: problem character is the Unicode en-dash (did you mean minus?).
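For tracking down characters like these, a small sketch along the following lines may help. It is a Python illustration only; the function name and the sample formula string are made up for the example and are not taken from the document.

```python
import unicodedata


def non_ascii_chars(formula):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(formula)
        if ord(ch) > 127
    ]


# Hypothetical formula text containing an ellipsis (U+2026) and an en-dash (U+2013).
sample = "x_1, \u2026, x_n \u2013 1"
for hit in non_ascii_chars(sample):
    print(hit)
```

Running this over the source of each formula lists the candidates to replace with their ASCII or AsciiMath equivalents.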
Will attempt decoding Unicode entities before passing them into AsciiMath processors.
Decoding the Unicode entities addresses 6.2, 6.3. 6.4 with the box is still open.
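The decoding step mentioned above can be illustrated with a minimal Python sketch. It assumes the entities arrive in the standard `&#x...;` form (the rendered output quoted in this thread has lost the leading `&`), and the function name is hypothetical; the actual fix lives in the Ruby gems.

```python
import html


def decode_entities(text):
    """Turn numeric character references back into literal characters."""
    return html.unescape(text)


# The four entities from the rendering above, with the leading "&" restored.
encoded = "&#x9A8C;&#x8BC1;&#x901A;&#x8FC7;"
print(decode_entities(encoded))  # prints 验证通过
```

Decoding before the text reaches the math processor means the processor sees the literal characters rather than the `#`, `x`, and `;` of the escape sequence.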
The box is somehow the MathML translation to Word (using Word's own stylesheet) being mangled. Cutting and pasting the MathML into Word does the same mangling, whereas the MathML rendered online is fine. If you go to Linear in the equation in Word, and retype it, Word inserts a space between the Sum and the thing being summed, and it works out. But you have to retype it: just deleting the space blows the equation up.
The dotted square is Word complaining that it's missing a parameter; I've compared what Word expects as OOXML and the OOXML it's got, and what's missing is m:sSupPr, which is a wrapper for superscript properties (because somehow the 2 is an m:sup). I don't see how it's a parameter, and how editing the AsciiMath or MathML would force that m:sSupPr to be supplied.
I've spent an hour on this, and I don't see a clean way forward. For non-trivial AsciiMath and MathML, you may have to edit the Word document: the translation from MathML to Word is imperfect; and given that you can copy paste MathML into Word, this has to be a known issue, that we aren't going to be the ones to solve.
I've fixed as much of this as I can:
frontpage: the "draft" message about "2018" shouldn't be shown
Removed. If you want it put back in for known created date, let me know.
Title should not have "SM2密码算法使用规范-" ending dash
Changed markup: moved title-intro to title-main. Will change XML generation so it does not generate empty elements, and the validation complains about the missing element.
the line * 为 ``真``表示"``验证通过``",为 ``假`` 表示"``验证不通过``"。 is rendered as 为 真 表示"#x9A8C;#x8BC1;#x901A;#x8FC7; ",为 假 表示"#x9A8C;#x8BC1;#x4E0D;#x901A;#x8FC7; "。
Fixed by changing the AsciiMath delimiter.
7.2, 7.3, 7.4 Math equations show a Unicode character number within. In 7.4, a space in between is rendered as a box.
Fixed for 7.2 and 7.3 by decoding Unicode escapes in AsciiMath. Can't fix 7.4: there is something broken about how Word translates sums into OOXML, and we will have to warn users about it.
Thank you @opoudjis for the markup clarification and fixes. I agree that the OOXML Math issue should just be considered unfixable for now. I wonder if we should file this bug in the html2doc repo so that one day someone could work on it.
I couldn't leave well enough alone :-( . I've found a fix: when a Sum (`munderover`) is followed by an Exponential (`msup`) in MathML, the exponential needs to be wrapped in `mrow` for Word not to complain. I don't know why, but I'll put the fix in.
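The workaround just described could be sketched like this. It is a Python illustration, not the code that went into the gem; it assumes "followed by" means the `msup` is the immediately following sibling of the `munderover`, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def local_name(el):
    """Tag name without any {namespace} prefix."""
    return el.tag.rsplit("}", 1)[-1]


def wrap_msup_after_sum(root):
    """Wrap every msup that directly follows a munderover in its own mrow."""
    for parent in list(root.iter()):  # snapshot, since the tree is modified below
        children = list(parent)
        for i in range(1, len(children)):
            prev, cur = children[i - 1], children[i]
            if local_name(prev) == "munderover" and local_name(cur) == "msup":
                ns = cur.tag[: -len("msup")]  # reuse the namespace part, if any
                mrow = ET.Element(ns + "mrow")
                mrow.tail, cur.tail = cur.tail, None
                parent.remove(cur)
                parent.insert(i, mrow)  # same position the msup occupied
                mrow.append(cur)
    return root


# Hypothetical input: a sum followed by a squared term.
src = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow>'
    "<munderover><mo>&#x2211;</mo><mi>i</mi><mi>n</mi></munderover>"
    "<msup><mi>x</mi><mn>2</mn></msup>"
    "</mrow></math>"
)
root = wrap_msup_after_sum(ET.fromstring(src))
ET.register_namespace("", MATHML_NS)
print(ET.tostring(root, encoding="unicode"))
```

The extra `mrow` does not change the meaning of the MathML; it only changes the grouping that Word's stylesheet sees when it converts the expression.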
Wow, it feels like AsciiMath to MathML should be a separate gem! :wink:
I've updated https://github.com/riboseinc/gmt-0009-2012 to show the original text of GM/T 0009-2012; the point is to recreate it in the GM standard format.
Currently there are minor issues like:
* 为 ``真``表示"``验证通过``",为 ``假`` 表示"``验证不通过``"。
is rendered as 为 真 表示"#x9A8C;#x8BC1;#x901A;#x8FC7; ",为 假 表示"#x9A8C;#x8BC1;#x4E0D;#x901A;#x8FC7; "。