Open ronaldtse opened 5 months ago
This issue turns out to be much harder than expected, partly because the Coradoc structure is a singly linked tree, but not only that. The incoming document is not semantic HTML: each list item is <ul><li>list item</li></ul>, with its indentation level specified only via CSS. The same goes for the non-list paragraphs: the CSS property sometimes means plain indentation, and sometimes it marks a paragraph that supplements a list item. Despite sounding like a simple issue, it is far more complicated, and the logic we currently have is already complicated enough.
I have spent numerous hours pondering the best solution, and I think the best way to handle this issue is in the linter, with the linter having to reparse the document. If we go with that solution, we will need a complete parser that can parse the entire incoming document.
+1 on making this a preprocessing task of some kind: your software cannot be made to deal with all the inanity that people will come up with in their HTML, nor assume that the HTML you will see will be clean and well-structured.
I don't have the exact HTML on hand to discuss.
However, let's use the following sample HTML to demonstrate what I mean:
<ul>
<ul><li>list item 1</li></ul>
<ul><li>list item 2</li></ul>
<ul>
<ul><li>list item 3</li></ul>
</ul>
</ul>
I am purposefully ignoring CSS here.
If you naively consider each <ul> as 1 level, the corresponding AsciiDoc representation is:
* // blank
** list item 1
* // blank
** list item 2
* // blank
** // blank
*** list item 3
Alternatively, if the <ul> wrappers that contain no <li> are ignored, it is:
* list item 1
* list item 2
* list item 3
In any case, it never becomes this (which is the current behavior):
** list item 1
** list item 2
*** list item 3
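To make the difference between these outputs concrete, here is a minimal sketch in Python. It is not reverse_adoc or Coradoc code, and `ListItemCollector` and `to_asciidoc` are hypothetical names. It computes a depth for every <li> in the sample HTML in two ways: counting every enclosing <ul>, which reproduces the `**`/`***` markers of the current behavior, and counting only enclosing <li> elements, which yields the flat list. It does not emit the `// blank` placeholder items, and it ignores CSS just like the sample above.

```python
from html.parser import HTMLParser

SAMPLE = """<ul>
<ul><li>list item 1</li></ul>
<ul><li>list item 2</li></ul>
<ul>
<ul><li>list item 3</li></ul>
</ul>
</ul>"""


class ListItemCollector(HTMLParser):
    """Record, for each <li>, how many <ul> and <li> elements enclose it."""

    def __init__(self):
        super().__init__()
        self.stack = []  # open elements as (tag, index into self.items or None)
        self.items = []  # one dict per <li>, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.stack.append(("ul", None))
        elif tag == "li":
            ul_depth = sum(1 for t, _ in self.stack if t == "ul")
            li_depth = 1 + sum(1 for t, _ in self.stack if t == "li")
            self.items.append(
                {"ul_depth": ul_depth, "li_depth": li_depth, "text": ""}
            )
            self.stack.append(("li", len(self.items) - 1))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in ("ul", "li") and any(t == tag for t, _ in self.stack):
            while self.stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        # Text only counts when the innermost open element is an <li>.
        if self.stack and self.stack[-1][0] == "li":
            self.items[self.stack[-1][1]]["text"] += data


def to_asciidoc(html, count="ul"):
    """Render list items; count='ul' is the naive depth, count='li' the semantic one."""
    collector = ListItemCollector()
    collector.feed(html)
    collector.close()
    key = "ul_depth" if count == "ul" else "li_depth"
    return [
        "*" * max(1, item[key]) + " " + item["text"].strip()
        for item in collector.items
    ]


print("\n".join(to_asciidoc(SAMPLE, count="ul")))
print("\n".join(to_asciidoc(SAMPLE, count="li")))
```

Counting <li> ancestors works here because in well-formed HTML a nested list lives inside an <li>, so wrapper <ul> elements that hold no item of their own add no depth.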
Then we have two different issues.
The issue described in the first post applies only to PLATEAU documents: this code is generated not from <ul><li> but from their CSS classes, and not by reverse_adoc proper but by the PLATEAU plugin.
I used this command to test the document:
File:
plateau/sections/section-09/section-02.adoc
In AsciiDoc, lists always start at the first level.
So the produced text is technically invalid AsciiDoc.
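One possible way to repair such output after the fact is to shift every list marker so that the shallowest item uses a single `*`. The sketch below is only an illustration under stated assumptions: `rebase_list_markers` is a hypothetical helper, not part of reverse_adoc or the PLATEAU plugin, and it treats all the lines it receives as one unordered list.

```python
import re

# One or more '*' followed by whitespace and the item text.
# Lines such as "*bold* text" or a "****" block delimiter do not match.
MARKER = re.compile(r"^(\*+)(\s+.*)$")


def rebase_list_markers(lines):
    """Shift unordered-list markers so the shallowest item is at level 1."""
    depths = [len(m.group(1)) for line in lines if (m := MARKER.match(line))]
    if not depths:
        return list(lines)
    shift = min(depths) - 1  # number of surplus leading '*' on every item
    rebased = []
    for line in lines:
        m = MARKER.match(line)
        if m:
            rebased.append("*" * (len(m.group(1)) - shift) + m.group(2))
        else:
            rebased.append(line)  # non-list lines pass through untouched
    return rebased


current = ["** list item 1", "** list item 2", "*** list item 3"]
print("\n".join(rebase_list_markers(current)))
```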
It should be: