ronaldtse closed this issue 4 years ago
Modeling completed at https://github.com/metanorma/metanorma-model-bipm
Introduce configurable abbreviations of organisations in standoc
Introduce configurable validation of committee values
For internationalisation, need to allow configuration to be hash of languages to files
Genuine internationalisation includes internationalisation of the display values of bibdata, but the original English values need to be maintained for logic. Will need to allow translation of bibdata in presentation XML, and extract both original and display values for metadata.
The i18n values (display values) for bibdata are in the local_bibdata element, which duplicates bibdata, but internationalises docstage, docsubstage, and doctype. @Intelligent2013 This is an update to Presentation XML.
The boilerplate also needs to be localised
Markup of the SI brochure needs to address the fact that included documents aren't being recognised as appendixes.
I'll list here the differences between the original PDF and the source Metanorma XML data.
The PDF title has the addon (SI), but in the XML there is no (SI) in the title:
<title language="en" format="text/plain" type="main">The International System of Units</title>
<title language="fr" format="text/plain" type="main">Le Système international d’unités</title>
On other pages of the PDF there is no (SI) either.
For the organisation name in the PDF, see the cover page and the 8th page.
In XML:
<contributor>
  <role type="author"/>
  <organization>
    <name>Bureau International de Poids et Mesures</name>
    <abbreviation>BIPM</abbreviation>
  </organization>
</contributor>
<contributor>
  <role type="publisher"/>
  <organization>
    <name>Bureau International de Poids et Mesures</name>
    <abbreviation>BIPM</abbreviation>
  </organization>
</contributor>
There is a difference between caps and 'des' vs. 'de'.
On the 5th page, the PDF specifies version number v1.07; in the XML there is none.
In the PDF, there is an indent between the table number and the table name:
If we need to reproduce it in the resulting PDF, we need to separate them somehow in the XML: add <tab/> or split them into different tags/attributes.
Current source XML:
<name>Tableau 1 — Les sept constantes définissant le SI et les sept unités qu’elles définissent</name>
(6th page in PDF)
(1) Added as /bibdata/title[@type = 'cover']
(2) Fixed
(3) Added to source as /bibdata/version/draft
(4) Fixed
(5) Added to /boilerplate/license-statement, but genericised to "This document" (the "SI Brochure" title is not in fact the title of the document, and I see little justification for perpetuating the idiosyncrasy.)
Appendixes are numbered with arabic numerals, not letters.
@opoudjis thank you.
and after the Contents pages there is a section:
but in the source XML these sections are placed together in /preface/abstract. Here is an example: the last <p> of the section before the Contents and the first <p> of the section after the Contents:
...
<p id="_8d11cbfb-90ac-499d-b38d-e9462181ac0f">Depuis 1965 la revue internationale <em>Metrologia</em>, éditée sous les auspices du Comité
international des poids et mesures, publie des articles sur la métrologie scientifique,
l’amélioration des méthodes de mesure, les travaux sur les étalons et sur les unités,
ainsi que des rapports concernant les activités, les décisions et les recommandations des
organes de la Convention du Mètre.</p>
<p id="_33d9c195-327b-4de1-96d2-ade72469a110">Depuis son établissement en 1960 par une résolution adoptée par la Conférence générale
des poids et mesures (CGPM) à sa 11<sup>e</sup> réunion, le Système international d’unités (SI) est
utilisé dans le monde entier comme le système préféré d’unités et comme le langage
fondamental de la science, de la technologie, de l’industrie et du commerce.</p>
...
but in the XML there is no such metadata:
<ol id="_3286792b-2ebf-4a60-a9a6-87ae3f960de8" type="arabic">
<li>
<p id="_787bf896-9427-4a4c-abd3-658446f35be4">Les unités photométriques peuvent être définies comme suit:</p>
<dl id="_f275db51-1135-4e51-aac3-7bb4f253be1c">
<dt><strong><em>Bougie nouvelle</em></strong> (unité d’intensité lumineuse).</dt>
<dd>
<p id="_f99e6a8b-8747-4232-846b-0529748b3e86">La grandeur de la bougie nouvelle est telle
que la brillance du radiateur intégral à la température de solidification du platine soit de
60 bougies nouvelles par centimètre carré.</p>
</dd>
<dt><strong><em>Lumen nouveau</em></strong> (unité de flux lumineux).</dt>
<dd>
<p id="_0fd9e1d0-86d7-44e9-ab86-8ff4178fb784">Le lumen nouveau est le flux lumineux émis dans
l’angle solide unité (stéradian) par une source ponctuelle uniforme ayant une intensité
lumineuse de 1 bougie nouvelle.</p>
</dd>
</dl>
</li>
<li>
<p id="_69d002e9-366e-43b6-a14c-ba604baa8513">. . .</p>
</li>
</ol>
therefore in the resulting PDF we have:
I see two possible solutions:
Put the full list item text in a plain <p>, like this:
<p>4. Les unités photométriques peuvent être définies comme suit :</p>
or
Add a start attribute to the <ol>, like this:
<ol id="_3286792b-2ebf-4a60-a9a6-87ae3f960de8" type="arabic" start="4">
<li>
<p id="_787bf896-9427-4a4c-abd3-658446f35be4">Les unités photométriques peuvent être définies comme suit:</p>
(6) Fixed markup, differentiated preface as prefatory clause.
(7) This will be a separate ticket: https://github.com/metanorma/metanorma-standoc/issues/349
Not proceeding with (7), changed markup.
metadata_extensions config in YAML needs to permit nested elements:
:comment-period-from:
:comment-period-to:
:comment-period-type:
:reply-to:
:security:
<comment-period type="">
  <from></from>
  <from></from>
  <from></from>
  <to></to>
  <reply-to></reply-to>
</comment-period>
<security></security>
metadata_extensions:
  comment-period:
    comment-period-type:
      _output: type
      _attribute: true
    comment-period-from:
      _output: from
      _list: true
    comment-period-to:
      _output: to
    reply-to:
  security:
Container elements may not have hash attributes (attribute = true or list = true or different output names).
Lists are assumed to be in CSV format (i.e. quotes override commas).
This is a change from the existing format, where extensions are lists.
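As a rough illustration of the nested format, here is a minimal Python sketch (not the standoc implementation, which is Ruby) that turns flat document attributes into nested XML using a config shaped like the one above. The function names, the sample attribute values, and the rule for telling a container from a leaf (a leaf is empty or holds only `_`-prefixed options) are assumptions made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical in-memory form of the nested metadata_extensions config.
CONFIG = {
    "comment-period": {
        "comment-period-type": {"_output": "type", "_attribute": True},
        "comment-period-from": {"_output": "from", "_list": True},
        "comment-period-to": {"_output": "to"},
        "reply-to": None,
    },
    "security": None,
}


def is_leaf(spec):
    """Assumed rule: a leaf is empty or contains only _-prefixed options."""
    return spec is None or all(key.startswith("_") for key in spec)


def split_csv(value):
    """Split a list value as one CSV record, so quotes override commas."""
    return next(csv.reader(io.StringIO(value), skipinitialspace=True), [])


def build(parent, config, attrs):
    """Populate parent with elements and attributes described by config."""
    for key, spec in config.items():
        if not is_leaf(spec):
            # Container element: it carries no options itself, only children.
            build(ET.SubElement(parent, key), spec, attrs)
            continue
        spec = spec or {}
        value = attrs.get(key)
        if value is None:
            continue  # document attribute not supplied
        name = spec.get("_output", key)
        if spec.get("_attribute"):
            parent.set(name, value)
        else:
            values = split_csv(value) if spec.get("_list") else [value]
            for item in values:
                ET.SubElement(parent, name).text = item
    return parent


# Hypothetical document attribute values.
ATTRS = {
    "comment-period-type": "public",
    "comment-period-from": '2020-01-01, "2020-02-01, noon"',
    "comment-period-to": "2020-03-01",
    "security": "restricted",
}

ext = build(ET.Element("ext"), CONFIG, ATTRS)
print(ET.tostring(ext, encoding="unicode"))
```

The quoted second value of `comment-period-from` keeps its comma, which is the point of treating lists as CSV.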
I need to deal with this in isodoc/metadata, so I will output /bibdata/ext/ as a Hash: https://stackoverflow.com/a/10144623
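To show the general idea of flattening /bibdata/ext into a hash, here is a small Python sketch (isodoc itself is Ruby, and this is not its code). The `@` prefix for attributes, the `#text` key, the sample values, and the absence of XML namespaces in the sample are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def element_to_hash(elem):
    """Convert an element to nested dicts; repeated children become lists."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text  # plain leaf: just its text
    result = {f"@{name}": value for name, value in elem.attrib.items()}
    if text:
        result["#text"] = text
    for child in children:
        value = element_to_hash(child)
        if child.tag in result:
            existing = result[child.tag]
            if not isinstance(existing, list):
                # Second occurrence of this tag: promote to a list.
                result[child.tag] = existing = [existing]
            existing.append(value)
        else:
            result[child.tag] = value
    return result


# Hypothetical sample, without the namespaces real Metanorma XML carries.
SAMPLE = """<bibdata><ext>
  <comment-period type="public">
    <from>2020-01-01</from>
    <from>2020-02-01</from>
    <to>2020-03-01</to>
  </comment-period>
  <security>restricted</security>
</ext></bibdata>"""

ext_hash = element_to_hash(ET.fromstring(SAMPLE).find("ext"))
print(ext_hash)
```

One consequence of this approach is that a list with a single entry comes out as a scalar, so consumers need to handle both shapes.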
Table des matières de l’annexe 1: I can't figure out how to display it in the PDF. It looks like some additional meta-information should be added to the XML.
8.1. The items are grouped by category.
8.2. The title contains only part of the title from the document body: From table of contents: From document body:
8.3. The source XML contains 2nd- and 3rd-level section numbers. Should I ignore them and show a 'quad' character instead of the 3rd-level number? For example, original PDF:
current resulting PDF:
and `**`:
But in the source XML they are encoded as <note>s:
<ol id="_ef59d0ec-5c54-4114-9f96-d0c10c17b50c" type="arabic">
<li>
<p id="_42ac29b0-1e59-4747-acb6-821578b2d679">Le kilogramme est l’unité de masse; il est égal à la masse du prototype international du
kilogramme;</p>
<note id="_eb31e25d-43c8-4d2d-be6a-aa6edef8b32e"><name>NOTE 1</name>
<p id="_56f20fe6-057f-4b2c-8222-d5fa162299d6">Définition abrogée en 2018 par la CGPM
à sa 26<sup>e</sup> réunion (Résolution 1, <em>voir</em> p.92).</p>
</note>
</li>
<li>
<p id="_2b2a73c0-8cdd-448c-a5f0-2afa54914a6d">Le terme poids désigne une grandeur de la même nature qu’une force; le poids d’un corps
est le produit de la masse de ce corps par l’accélération de la pesanteur;
en particulier, le poids normal d’un corps est le produit de la masse de ce corps par
l’accélération normale de la pesanteur;</p>
</li>
<li>
<p id="_32f0948f-9ccf-4165-93c1-809a114a6e98">Le nombre adopté dans le Service international des Poids et Mesures pour la valeur de
l’accélération normale de la pesanteur est <stem type="MathML"><math xmlns="http://www.w3.org/1998/Math/MathML"><mn>980</mn><mi>,</mi><mn>665</mn></math></stem> <stem type="MathML"><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mrow><mtext>cm/s</mtext></mrow><mrow><mn>2</mn></mrow></msup></math></stem>, nombre sanctionné déjà par
quelques législations.</p>
<note id="_79d8da3a-e3e1-47e3-8ef7-723ca172159f"><name>NOTE 2</name>
<p id="_cd6025e3-e374-4a14-ba84-218ed5af13e3">Cette valeur de <stem type="MathML"><math xmlns="http://www.w3.org/1998/Math/MathML"><msub><mrow><mi>g</mi></mrow><mrow><mtext>n</mtext></mrow></msub></math></stem> est la valeur conventionnelle de
référence pour le calcul de l’unité kilogramme-force
maintenant abolie.</p>
</note>
</li>
</ol>
Remarque:
which is coded in XML as a <note>:
<li>
<p id="_80b89e67-6dc9-425a-96aa-d27928740749">mettre à jour la fréquence de la transition suivante dans la liste des fréquences étalons
recommandées et l’approuver comme représentation secondaire de la seconde:</p>
<ul id="_85f8ccf8-fb70-47f5-bee9-7a6e40b32324">
<li>
<p id="_1ede75c4-875f-409f-9119-3013b050d10c">la transition quantique hyperfine non perturbée de l’état fondamental de l’atome de
<sup>87</sup>Rb, à la fréquence de <stem type="MathML"><math xmlns="http://www.w3.org/1998/Math/MathML"><mn>6</mn><mtext> </mtext><mn>834</mn><mtext> </mtext><mn>682</mn><mtext> </mtext><mn>610.904</mn><mtext> </mtext><mn>312</mn><mtext> Hz</mtext></math></stem> avec une incertitude-type
relative estimée de <stem type="MathML"><math xmlns="http://www.w3.org/1998/Math/MathML"><mn>1</mn><mi>,</mi><mn>3</mn><mo>×</mo><msup><mrow><mn>10</mn></mrow><mrow><mrow><mo>−</mo><mn>15</mn></mrow></mrow></msup></math></stem>.</p>
<note id="_852fee6c-91d1-4f0a-be10-a2263cb7c2e1"><name>NOTE 2</name>
<p id="_193943b0-74b4-47a1-ac59-cb7d5b36999f">La valeur de l’incertitude-type est supposée correspondre à un niveau de confiance
de 68 %. Toutefois, étant donné le nombre très limité de résultats disponibles, il se peut que,
rétrospectivement, cela ne s’avère pas exact.</p>
</note>
</li>
</ul>
</li>
</ul>
In the BIPM XSLT, the <note> tag is used for side notes (at the right edge of the page). The 'Remarque' text should be encoded as something else, or we need to define rules for when a note goes in the text and when at the page edge.
(8.1, 8.2) The ToC is clearly not an automatically generated ToC. IMO we should leave it as text, but the page number references will need to be replaced with cross-references in the source markup.
(8.3) The section numbers must be retained for HTML, because page numbers in cross-references for the HTML are meaningless, and the HTML should cross-reference something. (It could do so without an overt anchor text, but that would involve too much fiddling with the source markup to be reasonable.) For the PDF, therefore, the brochure (and at this stage only the brochure) should indeed ignore subsection numbers in Annexes.
(9, 10) There won't be any rules. In my opinion we are going to have to do a mix of:
How are these "block notes" encoded then?
As normal notes too?
I have sought clarification from BIPM for 8.1/8.2 and 9/10.
From BIPM:
(8.1, 8.2) The ToC is clearly not an automatically generated ToC. IMO we should leave it as text, but the page number references will need to be replaced with cross-references in the source markup.
Let's encode the ToC as normal text.
(9, 10) There won't be any rules. In my opinion we are going to have to do a mix of: How are these "block notes" encoded then?
These notes are "table notes" they apply to the table immediately above, and therefore are not side notes.
Ping @manuel489 to fix the source, and @opoudjis @Intelligent2013 .
8.3. Fixed in xslt:
These notes are "table notes" they apply to the table immediately above, and therefore are not side notes.
In the source PDF there is a case where side notes relate to a table:
We should define exactly what 'table notes' means. In my opinion, in terms of XML it means:
...
</tr>
</tbody>
<note .... </note>
</table>
I put all <note>s as side notes (except in the preface section).
When should we render notes as side notes, and when as 'table notes'?
I don’t remember exactly but there is a differentiation between a note and a table note. Perhaps @opoudjis can answer better.
These notes are "table notes" they apply to the table immediately above, and therefore are not side notes.
@ronaldtse , I believe the source already has the correct markup about the table notes, which is:
...
| bar | stem:["bar"] | stilb | stem:[sf "sb"]
| hour | stem:["h"] | |
|===
NOTE: The symbols whose unit names are preceded by dots are those which had already been adopted by a decision of the CIPM.
NOTE: The symbol for the stere, the unit of volume for firewood, shall be "st" and not "s", which had been previously assigned to it by the CIPM.
NOTE: To indicate a temperature interval or difference, rather than a temperature, the word "degree" in full, or the abbreviation "deg", must be used.
Maybe the necessary changes are in the YAML files.
@manuel489 thanks. @opoudjis are the table notes typically encoded as normal notes?
Yes
Table notes are notes within a <table>, so they should be rendered in XML as @Intelligent2013 expects them to be. I will need to debug why they aren't being treated that way.
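The "note within a <table>" rule can be checked mechanically. Below is a minimal Python sketch, not the BIPM XSLT, that labels each note as a table note when it has a table ancestor and as "other" otherwise; the sample XML, the ids, and the label names are assumptions, and the sketch leaves open how non-table notes are split between side notes and block notes.

```python
import xml.etree.ElementTree as ET


def classify_notes(root):
    """Map each note id to 'table' if it sits inside a <table>, else 'other'."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    kinds = {}
    for note in root.iter("note"):
        kind = "other"
        node = note
        while node in parent_of:
            node = parent_of[node]
            if node.tag == "table":
                kind = "table"
                break
        kinds[note.get("id")] = kind
    return kinds


# Hypothetical sample with one note inside a table and one outside.
SAMPLE = """<clause>
  <table id="t1">
    <tbody><tr><td>bar</td></tr></tbody>
    <note id="n1"><p>Applies to the table above it.</p></note>
  </table>
  <note id="n2"><p>Stands on its own.</p></note>
</clause>"""

print(classify_notes(ET.fromstring(SAMPLE)))
```

An XSLT stylesheet could express the same test with an ancestor check on the note element.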
@ronaldtse:
This block note is neither a table note nor a side note.
This discussion thread is quickly becoming unmanageable. I am moving discussion of notes (9, 10) to a new ticket: https://github.com/metanorma/metanorma-bipm/issues/15
I have also posted https://github.com/metanorma/metanorma-bipm/issues/16 separately.
(11) The treatment of cross references in the existing HTML and PDF is clearly manually generated and inconsistent, and we should not be seeking to replicate it:
https://www.bipm.org/en/CGPM/db/17/2/
HTML: See Recommendation 1 (CI-2002) of the CIPM on the revision of the practical realization of the definition of the metre.
PDF: See Recommendation 1 (CI-2002) of the CIPM on the revision of the practical realization of the definition of the metre, p. 181.
https://www.bipm.org/en/CIPM/db/1984/1/
HTML: The CIPM, in 2002, decided to change the explanation of the quantity dose equivalent in the SI Brochure (Recommendation 2).
PDF: * The CIPM, in 2002, decided to change the explanation of the quantity dose equivalent in the SI Brochure (Recommendation 2, see p. 182).
In the first instance, the page reference is appended to the end of the paragraph; in the second, it is inserted with a "see" within the cross-reference. The first instance cannot be generated sensibly from a single Asciidoctor cross-reference, and we should not seek to: markup should be adjusted so as to give sensible results in both PDF and HTML. The needed template for that is to put "see" before any cross-references, allow page numbers to trail after the cross-reference, and have uniform text for all cross-references.
See <<ci-2002-r1>> on the revision of the practical realization of the definition of the metre.
See **Recommendation 1 (CI-2002) of the CIPM** on the revision of the practical realization of the definition of the metre.
See **Recommendation 1 (CI-2002) of the CIPM, p. 181** on the revision of the practical realization of the definition of the metre.
The CIPM, in 2002, decided to change the explanation of the quantity dose equivalent in the SI Brochure (<<ci-2002-r2>>).
The CIPM, in 2002, decided to change the explanation of the quantity dose equivalent in the SI Brochure (**Recommendation 2 (CI-2002) of the CIPM**).
The CIPM, in 2002, decided to change the explanation of the quantity dose equivalent in the SI Brochure (**Recommendation 2 (CI-2002) of the CIPM, p. 182**).
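The template above can be sketched as a single substitution that produces both outputs from one source cross-reference. This Python snippet is only an illustration of the rule (uniform label, page number trailing in the PDF only); the `TARGETS` table and the function name are hypothetical, and the bold markers simply mirror the examples above.

```python
import re

# Hypothetical cross-reference targets: anchor -> uniform label and PDF page.
TARGETS = {
    "ci-2002-r2": {"label": "Recommendation 2 (CI-2002) of the CIPM", "page": 182},
}

XREF = re.compile(r"<<([^<>]+)>>")


def render(text, output):
    """Expand <<anchor>> references; only the PDF gets a trailing page number."""
    def expand(match):
        target = TARGETS[match.group(1)]
        label = target["label"]
        if output == "pdf":
            label += f", p. {target['page']}"
        return f"**{label}**"
    return XREF.sub(expand, text)


SOURCE = (
    "The CIPM, in 2002, decided to change the explanation of the quantity "
    "dose equivalent in the SI Brochure (<<ci-2002-r2>>)."
)
print(render(SOURCE, "html"))
print(render(SOURCE, "pdf"))
```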
(11) is now a distinct ticket, #17
Based on the SI Brochure layout: https://www.bipm.org/utils/common/pdf/si-brochure/SI-Brochure-9-EN.pdf