vsch / flexmark-java

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
BSD 2-Clause "Simplified" License

support for letter ordered list (in core or extension) #386

Open terefang opened 4 years ago

terefang commented 4 years ago

Is your feature request related to a problem? Please describe.

CommonMark discussed letter-ordered lists here: Letter-ordered lists

pandoc documents it here: ordered-lists (see the "fancy_lists" paragraph)

Describe the solution you'd like

Support pandoc-style letter ordered lists, either in core or as an extension.

Describe alternatives you've considered

None

vsch commented 4 years ago

@terefang, I checked out the discussion for letter ordered lists and did not see a conclusion.

Implementing it as discussed would add significant ambiguity and false positives, which in turn need workarounds such as requiring a double space after the letter marker. These are kludges, not solutions.

Why not use CSS and the attributes extension to define what type of list should be used? This has the advantage of already being part of the library, and it gives the flexibility of defining the list style in CSS instead of in the HTML renderer.

For example, adding a class attribute such as {.lower-roman} to a list would identify it as a list of lowercase roman numerals. In this case, adding the attribute to the first item would assign it to the list:

1. {.lower-roman} Item 1
2. Item 2

would render to HTML as:

<ol class="lower-roman">
  <li>Item 1</li>
  <li>Item 2</li>
</ol>

The rest would be up to the CSS to define how this class is rendered. Simplest would be:

ol.lower-roman {
    list-style-type: lower-roman;
}

BTW, changing the class to {.lr} for brevity would shorten the required tag and reduce visual noise, at the expense of mnemonic clarity.

What do you think of the workaround using the attributes extension?

terefang commented 4 years ago

The issue is that I don't use any HTML or CSS. I use only the parser part of flexmark and implement my own custom PDF renderer, because the Markdown is buried in XML page descriptions like this:

<?xml version="1.0" encoding="utf-8"?>
<document>
    <?font id="hvr" name="pdf:helvetica" charset="pdfdoc" ?>
    <?define
            markdown-paragraph-font="hvr"
            page-mediabox="595,842"
            ?>
    <page>
        <markdown pos="50,810" width="495"  >
<![CDATA[
## OPEN GAME LICENSE VERSION 1.0A

The following text is ...
]]></markdown>
        <markdown pos="50,770" width="240"  >
<![CDATA[
#### 1. Definitions

(a) "Contributors" means ...
(b) "Derivative Material" ...
(c) "Distribute" means ...
(d) "Open Game Content" means ...
(e) "Product Identity" means ...
(f) "Trademark" means ...
(g) "Use", "Used" or "Using" means...
(h) "You" or "Your" means the licensee in terms of this agreement

...

]]></markdown>
    </page>
</document>

Using attributes is not feasible, because the users writing the Markdown are non-STEM people who expect a rather natural writing flow (a sensibly reduced set of CommonMark).

And the PDF generator only implements that reduced set of CommonMark.

Could this be implemented as a flexmark extension? I.e. recognizing paragraphs as ordered list items via patterns like "/^([a-z]+)\s\s+/" or "/^[a-z]+\.\s\s+/" and conflating successive items into a block?

And/or switching normal list item processing off?
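To make the "recognize by pattern and conflate successive items" idea concrete, here is a minimal standalone sketch in plain Java. It is not flexmark API: the class `LetterListScanner`, its nested `LetterList` type, and the exact marker syntax (`(a)`, `a)` or `a.` followed by whitespace) are assumptions made for illustration only. Requiring a delimiter around the letter is one possible way to reduce the false positives mentioned earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of flexmark: groups consecutive
// letter-marked lines into lists before or alongside normal parsing.
class LetterListScanner {

    // Assumed marker syntax: "(a) text", "a) text" or "a. text".
    // A bare letter followed by a space (e.g. "a word") is NOT a marker.
    private static final Pattern ITEM =
            Pattern.compile("^(?:\\(([a-z])\\)|([a-z])[.)])\\s+(\\S.*)$");

    /** One letter-ordered list: markers and item texts in source order. */
    static final class LetterList {
        final List<String> markers = new ArrayList<>();
        final List<String> items = new ArrayList<>();
    }

    /** Scans lines and conflates successive matching lines into one list. */
    static List<LetterList> scan(List<String> lines) {
        List<LetterList> result = new ArrayList<>();
        LetterList current = null;
        for (String line : lines) {
            Matcher m = ITEM.matcher(line);
            if (m.matches()) {
                if (current == null) {
                    current = new LetterList();
                    result.add(current);
                }
                // Exactly one of the two marker groups is set.
                String marker = m.group(1) != null ? m.group(1) : m.group(2);
                current.markers.add(marker);
                current.items.add(m.group(3));
            } else {
                // Any non-item line (including a blank one) ends the list.
                current = null;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "#### 1. Definitions",
                "",
                "(a) \"Contributors\" means ...",
                "(b) \"Derivative Material\" ...");
        for (LetterList list : scan(lines)) {
            for (int i = 0; i < list.items.size(); i++) {
                System.out.println(list.markers.get(i) + " -> " + list.items.get(i));
            }
        }
    }
}
```

In an actual extension the same line test would presumably live inside a custom block parser rather than a pre-pass over raw lines; that wiring is library-specific and not shown here. The sketch also does not check that the letters are in sequence, and it does not handle multi-line items.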