metanorma / mnconvert

Metanorma converter

STS -> MN: Incorrect sequence of list items shown in output #36

Open manuelfuenmayor opened 3 years ago

manuelfuenmayor commented 3 years ago

In relation to https://github.com/metanorma/metanorma-bsi/issues/30

Currently, several ordered lists are rendered with an incorrect sequence. To take an example from BS EN 12973: Capture_0

Original XML code:

<list list-type="order">
<list-item><label>3)</label><p><bold>The use of the FPS</bold></p></list-item></list>
<p>This paragraph particularly illustrates the implementation of the FPS in the case of the study of products of average complexity.</p>
<list list-type="alpha-lower">
<list-item><label>a)</label><p><bold>The evolution of the FPS and the progress of a project</bold></p></list-item></list>
<p>The enquirer writes a first version of the FPS that will be transmitted to potential suppliers to ascertain their position and obtain their response in terms of feasibility. This version, comprising the maximum of flexibilities, will be amended to take into account their remarks and suggestions.</p>
<p>A stabilized version is thus achieved that will be the basis for consultation or a call for tenders.</p>
<p>The different competitors will take advantage of the remaining flexibilities to prepare the proposal that seems to them the most satisfactory. These proposals, written according to the answer framework included in the FPS, will be objectively compared.</p>
<p>The last options having been specified, it may become necessary, before issuing the order for the supply of the product, to establish a technical specification.</p>

is converted to:

. *The use of the FPS*

This paragraph particularly illustrates the implementation of the FPS in the case of the study of products of average complexity.

. *The evolution of the FPS and the progress of a project*

The enquirer writes a first version of the FPS that will be transmitted to potential suppliers to ascertain their position and obtain their response in terms of feasibility. This version, comprising the maximum of flexibilities, will be amended to take into account their remarks and suggestions.

A stabilized version is thus achieved that will be the basis for consultation or a call for tenders.

The different competitors will take advantage of the remaining flexibilities to prepare the proposal that seems to them the most satisfactory. These proposals, written according to the answer framework included in the FPS, will be objectively compared.

The last options having been specified, it may become necessary, before issuing the order for the supply of the product, to establish a technical specification.

This generates an incorrect sequence in the output: Capture_1

To solve the issue, I'd recommend using the attributes type and start in every list item:

[type="arabic",start=3]
. *The use of the FPS*

This paragraph particularly illustrates the implementation of the FPS in the case of the study of products of average complexity.

[type="alphabetic",start=a]
. *The evolution of the FPS and the progress of a project*

The enquirer writes a first version of the FPS that will be transmitted to potential suppliers to ascertain their position and obtain their response in terms of feasibility. This version, comprising the maximum of flexibilities, will be amended to take into account their remarks and suggestions.

A stabilized version is thus achieved that will be the basis for consultation or a call for tenders.

The different competitors will take advantage of the remaining flexibilities to prepare the proposal that seems to them the most satisfactory. These proposals, written according to the answer framework included in the FPS, will be objectively compared.

The last options having been specified, it may become necessary, before issuing the order for the supply of the product, to establish a technical specification.

Now, these attributes seem not to be working for BSI flavor, but surely @opoudjis can help us with that, right?
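A minimal sketch of how the proposed attribute line could be derived from the STS `<list>` element. The function name `proposed_list_attributes` and the mapping table are hypothetical; the `type` values simply mirror the ones written in this comment and are not confirmed to be accepted by any Metanorma flavor.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from STS list-type values to the "type" values
# proposed in this comment (assumption: not a confirmed Metanorma vocabulary).
PROPOSED_TYPES = {"order": "arabic", "alpha-lower": "alphabetic"}


def proposed_list_attributes(list_xml):
    """Return an attribute line such as [type="arabic",start=3], or None."""
    root = ET.fromstring(list_xml)
    list_type = PROPOSED_TYPES.get(root.get("list-type"))
    # The label of the first item carries the starting value, e.g. "3)" or "a)".
    label = root.findtext("list-item/label")
    if list_type is None or not label:
        return None
    start = label.strip().rstrip(").")
    return f'[type="{list_type}",start={start}]'


sample = (
    '<list list-type="order"><list-item><label>3)</label>'
    "<p><bold>The use of the FPS</bold></p></list-item></list>"
)
print(proposed_list_attributes(sample))
```

The sketch only covers the two list types that appear in the sample above; any other `list-type` returns `None` rather than guessing.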

opoudjis commented 3 years ago

Inasmuch as this is a violation of BS 0.2:2017 23.1, I'm reluctant to:

In a list of items requiring individual identification in the text, each item should be preceded by a lower-case letter followed by a single closing parenthesis, i.e. “a), b), c)”. If subdivision of an item is necessary, each subdivided item should be preceded by an Arabic numeral followed by a single closing parenthesis, i.e. “1), 2), 3)”. If a further level of subdivision is necessary, each subdivided item should be preceded by a lower-case roman numeral followed by a single closing parenthesis, i.e. “i), ii), iii)”, or by a bullet. Where a list is subdivided, the second level should be indented from the first, and the third level should be indented from the second.

If there is more than one list within a single numbered subdivision of the text, the second list should be numbered “1), 2), 3)” and the third “i), ii), iii)”. It is not advisable to have more than three lists within a single numbered subdivision of the text. Primary lists should not be indented.

I'm aware that the spec says should and not shall. I am still unwilling to implement this unless I can be persuaded that this is intentional, and appropriate.

And I do not believe that to be the case in BS EN 12973.

What Annex A is doing in this case with its numbering is daft, but it is NOT an ordered list. These are clearly ad hoc subheadings, introduced to get around the fact that BSI does not permit subheadings more than 4 levels deep: BS 0.2:2017 22.3.2.

In fact, if this deserves any formatting at all, it needs to be treated as a 5th level and 6th level heading (because that's clearly what is going on here), which needs to be rendered idiosyncratically by eliminating the first four levels, and smiling naughtily and pretending that it is in compliance with BS 0.2:2017 22.3.2.

And if any of the other instances are also bold faced, in a four level subheading, and introducing plain text paragraphs, then that's what we will need to do with them as well. Please investigate and let me know what the other instances look like.

opoudjis commented 3 years ago

This is a markup issue btw. It needs to go to metanorma-bsi...

... and if this distortion of markup is actually present in the STS XML... well, I'm sorry, but it still needs to be somehow converted to subheadings. These simply are not ordered lists. If they were, the sublists for a) would have been indented:

1)
  a)
  b)

and that would have immediately given away the fact that this document is cheating.

manuelfuenmayor commented 3 years ago

In addition, these "subheadings" are not exactly following the same style in the document:

In one section it shows: Capture_1

and, in another section: Capture_2

opoudjis commented 3 years ago

Well... that, we cannot and should not address: we will generate them consistently.

opoudjis commented 3 years ago

If there is a general rule that we can apply, that boldfaced list entries, followed by another boldfaced list entry, or a plain paragraph, are in fact subheadings, then I can take care of their rendering in Presentation XML.

@Intelligent2013 If you think you can realise this, make me a ticket in metanorma-bsi to process those subheadings in that way, so we get something like the PDF back.
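The general rule suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the document is modelled as a flat list of `(kind, text)` tuples, and the names `is_bold_item` and `find_pseudo_headings` are hypothetical, not part of mnconvert or metanorma-bsi.

```python
import re

# Text that consists of a single boldfaced run, e.g. "*The use of the FPS*".
BOLD_ONLY = re.compile(r"^\*[^*]+\*$")


def is_bold_item(block):
    """True if the block is a list item whose whole text is boldfaced."""
    kind, text = block
    return kind == "list_item" and bool(BOLD_ONLY.match(text.strip()))


def find_pseudo_headings(blocks):
    """Return indexes of boldfaced list items that behave like subheadings.

    A boldfaced list item counts as a subheading when the next block is
    either another boldfaced list item or a plain paragraph.
    """
    hits = []
    for i, block in enumerate(blocks[:-1]):
        nxt = blocks[i + 1]
        if is_bold_item(block) and (is_bold_item(nxt) or nxt[0] == "para"):
            hits.append(i)
    return hits


blocks = [
    ("list_item", "*The use of the FPS*"),
    ("para", "This paragraph particularly illustrates the implementation of the FPS."),
    ("list_item", "*The evolution of the FPS and the progress of a project*"),
    ("para", "The enquirer writes a first version of the FPS."),
    ("list_item", "an ordinary list item"),
    ("para", "trailing paragraph"),
]
print(find_pseudo_headings(blocks))
```

A boldfaced item at the very end of the document is deliberately not matched, since the rule depends on what follows it.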

manuelfuenmayor commented 3 years ago

@Intelligent2013 please put this task on hold until I get a general overview of the issue across the other BSI documents. I will let you know. Thanks!

manuelfuenmayor commented 3 years ago

@opoudjis, in bs-11000-2 there is another case of boldfaced list items: issue5-original

Generated: issue5-generated

Notice that this is shown after a two-level subheading instead of a four-level one (as above).

opoudjis commented 3 years ago

Look at the markup:

. *Principle*

There should be an established linkage between the activities of the organization and the potential role for collaboration in supporting those activities.

. *Why*

Collaboration might not be appropriate for all business relationships and the focus should be directed to where it can add value.

. *How*

This is a strategic decision that needs to be considered by top management to ensure that collaborative working, when appropriate, is fully integrated and supported.

That is being presented as three independent lists, not as a single list with three extended paragraphs:

. *Principle*
+
----
There should be an established linkage between the activities of the organization and the potential role for collaboration in supporting those activities.
----

. *Why*
+
----
Collaboration might not be appropriate for all business relationships and the focus should be directed to where it can add value.
----

. *How*
+
----
This is a strategic decision that needs to be considered by top management to ensure that collaborative working, when appropriate, is fully integrated and supported.
----

I don't know what the STS XML is here, but either it is the equivalent of:

<ol><li>Principle</li></ol> 
<p>There should be an established linkage....</p>

which is marked up wrong, or else it is:

<ol><li>Principle
<p>There should be an established linkage....</p>
</li>
</ol>

which is being translated wrong. Asciidoctor assumes that each list entry is only a paragraph long. If there is a multiparagraph list entry in Asciidoctor, it MUST be marked up using list continuation: https://docs.asciidoctor.org/asciidoc/latest/lists/continuation/
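As a rough sketch of what the converter would need to emit for a multi-paragraph list entry, the helper below joins the paragraphs of one item with list continuation markers. The function name `render_list_item` is hypothetical and the input is assumed to be already converted inline text.

```python
def render_list_item(paragraphs, marker="."):
    """Render one list item; extra paragraphs are attached with '+' lines."""
    if not paragraphs:
        raise ValueError("a list item needs at least one paragraph")
    first, *rest = paragraphs
    lines = [f"{marker} {first}"]
    for para in rest:
        # A lone '+' with no blank line around it keeps the paragraph
        # inside the current list item.
        lines.append("+")
        lines.append(para)
    return "\n".join(lines)


item = render_list_item([
    "*Principle*",
    "There should be an established linkage between the activities of the organization and the potential role for collaboration in supporting those activities.",
])
print(item)
```

Emitting the three items this way, one after another, would keep them in a single ordered list instead of three independent lists.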

@Intelligent2013

Intelligent2013 commented 3 years ago

in bs-11000-2 there is another case of boldfaced list items:

Fixed: image

@opoudjis thanks for the information about list continuation.

manuelfuenmayor commented 3 years ago

@Intelligent2013, after checking all the BSI documents, it seems that BS EN 12973 is the only one that has this rare instance of ordered list-items. So, can you achieve what @opoudjis proposed in the comments above? (https://github.com/metanorma/mnconvert/issues/36#issuecomment-885653856, https://github.com/metanorma/mnconvert/issues/36#issuecomment-885655627, and https://github.com/metanorma/mnconvert/issues/36#issuecomment-885713137).

Basically, the goal is to convert these "ordered" list items into subheadings of level 5 and 6. Taking the sample I shared above in the description, we'd need to convert this:

. *The use of the FPS*

This paragraph particularly illustrates the implementation of the FPS in the case of the study of products of average complexity.

. *The evolution of the FPS and the progress of a project*

The enquirer writes a first version of the FPS that will be transmitted to potential suppliers to ascertain their position and obtain their response in terms of feasibility. This version, comprising the maximum of flexibilities, will be amended to take into account their remarks and suggestions.

A stabilized version is thus achieved that will be the basis for consultation or a call for tenders.

The different competitors will take advantage of the remaining flexibilities to prepare the proposal that seems to them the most satisfactory. These proposals, written according to the answer framework included in the FPS, will be objectively compared.

The last options having been specified, it may become necessary, before issuing the order for the supply of the product, to establish a technical specification.

to this:


====== The use of the FPS

This paragraph particularly illustrates the implementation of the FPS in the case of the study of products of average complexity.

[level=6]
====== The evolution of the FPS and the progress of a project

The enquirer writes a first version of the FPS that will be transmitted to potential suppliers to ascertain their position and obtain their response in terms of feasibility. This version, comprising the maximum of flexibilities, will be amended to take into account their remarks and suggestions.

A stabilized version is thus achieved that will be the basis for consultation or a call for tenders.

The different competitors will take advantage of the remaining flexibilities to prepare the proposal that seems to them the most satisfactory. These proposals, written according to the answer framework included in the FPS, will be objectively compared.

The last options having been specified, it may become necessary, before issuing the order for the supply of the product, to establish a technical specification.
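The target markup above could be generated along these lines. This is a sketch only: `render_heading` is a hypothetical helper, and it assumes the convention used in the sample, where level 5 is written with six `=` characters and deeper levels reuse that marker with a `[level=N]` attribute.

```python
def render_heading(item_text, level):
    """Turn a boldfaced list-item text into a heading of the given level."""
    if level < 1:
        raise ValueError("heading level must be 1 or greater")
    # Drop the surrounding bold markers, e.g. "*Title*" -> "Title".
    title = item_text.strip().strip("*").strip()
    if level <= 5:
        return "=" * (level + 1) + " " + title
    # Assumption: levels deeper than 5 follow the [level=N] form shown above.
    return f"[level={level}]\n====== {title}"


print(render_heading("*The use of the FPS*", 5))
print(render_heading("*The evolution of the FPS and the progress of a project*", 6))
```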

ronaldtse commented 3 years ago

@manuel489 the STS XML was originally a list. This is clearly an error in the original encoding.

<list list-type="order">
  <list-item>
    <label>3)</label>
    <p><bold>The use of the FPS</bold></p>
  </list-item>
</list>

<p>This paragraph particularly illustrates the implementation of the FPS in the case of the study of products of average complexity.</p>

<list list-type="alpha-lower">
  <list-item>
    <label>a)</label>
    <p><bold>The evolution of the FPS and the progress of a project</bold></p>
  </list-item>
</list>

<p>The enquirer writes a first version of the FPS...</p>

This is a decision we cannot take lightly: we are supposed to migrate the format, not change lists into section headings. We will seek clarification from BSI, but for sure we do not want to change them to section headings right now.

@Intelligent2013 please do not action this ticket for now, we will keep it open.

ronaldtse commented 3 years ago

And this is a generic STS XML issue.

Intelligent2013 commented 3 years ago

@Intelligent2013 please do not action this ticket for now, we will keep it open

@ronaldtse ok.