metanorma / mn-native-pdf

Development repository for mn2pdf with Metanorma document samples

Editor: Review generation of ISO document #34

Closed ronaldtse closed 3 years ago

ronaldtse commented 5 years ago

Please spot the differences between the source and generated PDF. Both are attached here. Thanks! iso-rice-en.pdf

iso-rice-en.doc.zip

anermina commented 5 years ago
  1. Header has extra "(E)" rendered: image original: image

  2. In Table of Contents, there should be no Appendix 1 Sample code rendered: image original: image

  3. Sections with [preface] attribute shouldn't be numbered rendered: image original: image

  4. Title after preface has an extra "Part 1:", i.e. "Cereals and pulses — Specifications and test methods — Part 1: Rice (in English)" rendered: image original: image

  5. R is capital in the title "Normative references" in the original document (but probably shouldn't be...) rendered: image original: image

  6. Footnote 1 is set before the comma in "ISO 16634:–1,", while it is set after the comma in the original document rendered: image original: image

  7. "ISO DATE:" is missing in the footnote 1 in the generated document rendered: image original: image

  8. Word "modified" is added in generated document in this sentence: "[SOURCE: ISO 7301:2011, 3.2, modified — The term “cargo rice” is shown as deprecated, and Note 1 to entry is not included here]" rendered: image original: image

  9. Font of the notes should be different rendered: image original: image

  10. Missing line break after "\" here: "\ organic and inorganic components other than whole or broken kernels" rendered: image original: image

  11. In Table 1, first two rows should be boldfaced, i.e. terms such as "in husked rice" should be boldfaced rendered: image original: image

  12. Entries are vertically centered in the Table 1 in the rendered document, while they are vertically aligned to the top in the original document (example above)

  13. Equations should be vertically centered rendered: image original: image

  14. Footnote font should be different rendered: image original: image

  15. Heading should be boldfaced in Table D.1 rendered: image original: image

  16. "EXAMPLE — Sample Code" is written in different font in the original document. I think this is properly rendered in generated document, but maybe it's worth noting this is also a difference rendered: image original: image

opoudjis commented 5 years ago

Responses.

  1. The doc identifier is compiled from bits of the XML, and the (E) derives from the source language; we will mitigate that by providing the full identifier in XML in https://github.com/metanorma/metanorma-standoc/issues/162

  2. Suppression of headings in the table of contents is actually idiosyncratic, which is why I have to turn <h2> into p.Annex2 in postprocessing. It's a nasty exception, and I dunno how flexible ToC generation is in xsl:fo, but the rule is to ignore anything but h1 and annex/appendix in appendixes.

  3. The Patent Notice is something we should have gotten rid of and haven't. It's obsolete markup. The Patent Notice currently forces the exception to include section number 0.

  4. Interpolating "part nnn" into titles is something Alexander has missed

  5. Normalising case in titles is an easy miss as well, that's done in code.

  6. Hm. footnote placement. No idea how that happened.

  7. ISO DATE, that may well be a bug on my side, can't tell right now. But it should not be present.

  8. Interpolating "modified" is done in code with modifications to term definitions.

  9. Font: don't know how that happened.

  10. The para for termdomain is a gotcha, it's meant to be in the same para as what follows it.

  11. That's about imposing th = bold throughout, applied to everything in thead.

  12. Centering is also an easy miss.

  13. Yes, equations are centered.

  14. Font: don't know how that happened.

  15. Again, thead = bold.

  16. Yes, sourcecode title is not monospace.

ronaldtse commented 5 years ago

@anermina thanks for the review! I noticed that some screenshots have been mixed... I will label the pictures as DOC or PDF for clarity.

(Since @opoudjis posted his comments before me, I noticed that the original listing of screenshots may be mislabeled. See this post for the correct order.)

  • Header has extra "(E)"

DOC:

image

PDF:

image

@Intelligent2013 knows about this, @opoudjis will be providing the full document identifier in https://github.com/metanorma/metanorma-standoc/issues/162

  • In Table of Contents, there should be no Appendix 1 Sample code

DOC:

image

PDF:

image

I think "Appendix 1 Sample code" should show up. @opoudjis will check and clarify.

  • Sections with [preface] attribute shouldn't be numbered

DOC:

image

PDF:

image

The ISO preface clause should be numbered (with 0) IFF it has subsections.
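A minimal sketch of that numbering rule, for illustration only. The element names (`clause` as a direct child of the preface clause) and the function name are assumptions, not taken from the actual XSLT or from iso-rice-en.xml:

```python
import xml.etree.ElementTree as ET
from typing import Optional


def preface_clause_number(clause: ET.Element) -> Optional[str]:
    """Return "0" if the preface clause has at least one direct child
    <clause> (i.e. subsections), otherwise None (unnumbered)."""
    # Hypothetical structure: subsections are direct <clause> children.
    has_subsections = clause.find("clause") is not None
    return "0" if has_subsections else None


# Two made-up preface clauses: one with a subsection, one without.
with_subs = ET.fromstring(
    "<clause id='intro'><title>Introduction</title>"
    "<clause id='intro-1'><title>General</title></clause></clause>"
)
without_subs = ET.fromstring(
    "<clause id='intro'><title>Introduction</title><p>Text.</p></clause>"
)

print(preface_clause_number(with_subs))     # 0
print(preface_clause_number(without_subs))  # None
```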

  • Title after preface has an extra "Part 1:", i.e. "Cereals and pulses — Specifications and test methods — Part 1: Rice (in English)"

DOC:

image

PDF:

image

The PDF output should match the DOC output.

  • R is capital in the title "Normative references" in the original document (but probably shouldn't be...)

DOC:

image

PDF:

image

@anermina is correct. The title should be in normal casing ("Normative References" => "Normative references").
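As a rough illustration of the case normalisation, here is a sketch that lowercases every capitalised word after the first and leaves all-caps tokens such as "ISO" alone. The function name is hypothetical, and this is not how the Metanorma code does it; a real implementation would also need to protect proper nouns:

```python
def to_sentence_case(title: str) -> str:
    """Lowercase plain capitalised words after the first word.

    Words that are not of the form "Xxxx" (acronyms, mixed case,
    single letters) are left unchanged.
    """
    words = title.split()
    if not words:
        return title
    normalised = [words[0]]
    for word in words[1:]:
        if word[0].isupper() and word[1:].islower():
            normalised.append(word[0].lower() + word[1:])
        else:
            normalised.append(word)
    return " ".join(normalised)


print(to_sentence_case("Normative References"))  # Normative references
```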

  • Footnote 1 is set before the comma in "ISO 16634:–1,", while it is set after the comma in the original document

DOC:

image

PDF:

image

The DOC output is correct here.

  • "ISO DATE:" is missing in the footnote 1 in the generated document

DOC:

image

PDF:

image

The DOC output is correct.

  • Word "modified" is added in generated document in this sentence: "[SOURCE: ISO 7301:2011, 3.2, modified — The term “cargo rice” is shown as deprecated, and Note 1 to entry is not included here]"

DOC:

image

PDF:

image

The DOC output is correct.

  • Font of the notes should be different

I don't see the differences?

  • Missing line break after "" here: " organic and inorganic components other than whole or broken kernels"

DOC:

image

PDF:

image

DOC is correct. The "domain" is placed in the immediate front of the definition.

  • In Table 1, first two rows should be boldfaced, i.e. terms such as "in husked rice" should be boldfaced

DOC:

image

PDF:

image

Here's the definitive image from the original ISO Rice document:

Screen Shot 2019-12-03 at 10 52 53 AM

The DOC version: should have the 2nd header row bolded, and the math unbolded. Missing "%" sign in the header row 1 column 2 as line 3. (ping @opoudjis )

The PDF version: mostly correct. Missing "%" sign in the header row 1 column 2 as line 3.

  • Entries are vertically centered in the Table 1 in the rendered document, while they are vertically aligned to the top in the original document (example above)

Notice in the first data row first column the source encoding is wrong -- it should have been:

| Extraneous matter: ...
|
* organic
|...
|
* inorganic
|...

See the definitive image to compare.

  • Equations should be vertically centered

DOC:

image

PDF:

image

DOC is correct here. There should also be an indent for the definition list after "where".

The definitive image from the original ISO Rice:

Screen Shot 2019-12-03 at 10 54 33 AM

  • Footnote font should be different

I'm not sure if I see a difference...

  • Heading should be boldfaced in Table D.1

I'm replacing the images here:

Definitive:

Screen Shot 2019-12-03 at 10 56 01 AM

DOC:

Screen Shot 2019-12-03 at 10 55 31 AM

PDF:

Screen Shot 2019-12-03 at 10 56 21 AM

There are some issues in both outputs.

  • "EXAMPLE — Sample Code" is written in different font in the original document. I think this is properly rendered in generated document, but maybe it's worth noting this is also a difference

DOC:

image

PDF:

image

DOC is correct. The label ("EXAMPLE -- Sample Code") should always be in the body font.

ronaldtse commented 5 years ago

There is another issue with "title-less clauses".

In ISO Rice:

Screen Shot 2019-12-03 at 11 00 17 AM

In DOC:

Screen Shot 2019-12-03 at 11 00 39 AM

In PDF:

Screen Shot 2019-12-03 at 11 00 57 AM

DOC and PDF:

ronaldtse commented 5 years ago

@Intelligent2013 actually, instead of a visual diff with DOC output, let's directly align with the original ISO Rice document: https://github.com/metanorma/mn-samples-iso/blob/master/reference-docs/iso-rice-sample.pdf

Will start a new issue for that.

Intelligent2013 commented 4 years ago

@ronaldtse

  • Header has extra "(E)"

There is no information on how to obtain it.

In https://github.com/metanorma/mn-native-pdf/issues/32: "In the meantime, just do without the (E)."

  • In Table of Contents, there should be no Appendix 1 Sample code

There is no Appendix 1 in the original ISO Rice document; the last annex is Annex D. In iso-rice-en.xml, Appendix 1 is a subsection of the last annex. In the current version of the XSLT, appendixes are not shown in the ToC. If you need the Appendix to be shown as a subsection of the Annex, please let me know.

  • The ISO preface clause should be numbered (with 0) IFF it has subsections.

Fixed.

  • Title after preface has an extra "Part 1:", i.e. "Cereals and pulses — Specifications and test methods — Part 1: Rice (in English)"

Fixed.

  • R is capital in the title "Normative references" in the original document (but probably shouldn't be...)

In source xml: https://metanorma.github.io/mn-samples-iso/documents/iso-rice-en.xml: <title>Normative References</title>

  • Footnote 1 is set before the comma in "ISO 16634:–1,", while it is set after the comma in the original document

Fixed.

  • "ISO DATE:" is missing in the footnote 1 in the generated document

'ISO DATE:' is part of the note element in the XML:

<note format="text/plain"> ISO DATE: Under preparation. (Stage at the time of publication ISO/DIS 16634) </note>
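To make the point concrete, here is a small sketch that reads the note quoted above and reports whether its text starts with the "ISO DATE:" prefix, so a renderer could keep or drop it either way. The function name and the idea of splitting off the prefix are assumptions for illustration; they are not part of the current XSLT:

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# The note element as quoted from the XML above.
NOTE_XML = (
    '<note format="text/plain"> ISO DATE: Under preparation. '
    "(Stage at the time of publication ISO/DIS 16634) </note>"
)
PREFIX = "ISO DATE:"


def split_note(note: ET.Element) -> Tuple[bool, str]:
    """Return (has_prefix, remaining_text) for a <note> element."""
    text = (note.text or "").strip()
    if text.startswith(PREFIX):
        return True, text[len(PREFIX):].strip()
    return False, text


has_prefix, body = split_note(ET.fromstring(NOTE_XML))
print(has_prefix)  # True
print(body)
```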

  • Word "modified" is added in generated document in this sentence: "[SOURCE: ISO 7301:2011, 3.2, modified — The term “cargo rice” is shown as deprecated, and Note 1 to entry is not included here]"

Fixed.

  • Font of the notes should be different

I don't see any differences.

  • Missing line break after "" here: " organic and inorganic components other than whole or broken kernels"

Fixed.

  • In Table 1, first two rows should be boldfaced, i.e. terms such as "in husked rice" should be boldfaced

No changes. The PDF version is correct (there is no % in the XML data).

  • Equations should be vertically centered

Fixed.

  • There should also be an indent for the definition list after "where".

Fixed.

  • Footnote font should be different

I don't see any differences.

  • The label ("EXAMPLE -- Sample Code") should always be in the body font

Fixed.

  • Should not have a line break between clause number and content.
  • All links to clauses (not figures or tables) should be hyperlinked in BLUE.
  • Footnote should be "2)" not just "2".

Fixed.

  • Notice in the tables, all headers should be "vertically and horizontally centered".

Some cells have left alignment in the source XML:

<th rowspan="2" align="left">Defect</th>
<td align="left">in husked rice</td>
<td rowspan="2" align="left">Description</td>

If you need me to ignore the 'align' attribute from the XML, please let me know.
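The two options can be sketched as follows. This is only an illustration of the decision, not the XSLT itself: the `<tr>` wrapper, the function name and the `honour_source` flag are all hypothetical, and the three cells are the ones quoted above:

```python
import xml.etree.ElementTree as ET

# The three cells quoted above, wrapped in a <tr> so the fragment parses.
FRAGMENT = """
<tr>
  <th rowspan="2" align="left">Defect</th>
  <td align="left">in husked rice</td>
  <td rowspan="2" align="left">Description</td>
</tr>
"""


def header_alignment(cell: ET.Element, honour_source: bool) -> str:
    """Pick the horizontal alignment for a header cell.

    honour_source=True  -> use the cell's 'align' attribute if present.
    honour_source=False -> ignore it and always centre the cell.
    """
    if honour_source:
        return cell.get("align", "center")
    return "center"


row = ET.fromstring(FRAGMENT)
for cell in row:
    print(cell.text, header_alignment(cell, True), header_alignment(cell, False))
```

With `honour_source=False` every header cell comes out centred regardless of what the source says, which is the behaviour asked for in the bullet above; with `honour_source=True` the left alignment from the XML is kept.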

ronaldtse commented 3 years ago

@anermina is this done? If so please close it. Thanks!

anermina commented 3 years ago

Yes, this one should have been closed. Closing now.