vsch / flexmark-java

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
BSD 2-Clause "Simplified" License

customized HTML_BLOCK_TAGS Parser option seems not taken into account #254

Closed pi-r-p closed 6 years ago

pi-r-p commented 6 years ago

Hi,

I tried to add a custom block tag, to allow code insertion in a custom webcomponent. Here is my test.md file:

    # test

    <warp10-warpscript-widget>
    <%
      <'
      This **code** is not markdown
      $list <% %> FOREACH

    //There should not be no <p> here

    %>
    </warp10-warpscript-widget>

    **markdown again**

I use these options:


    ArrayList<String> userTags = new ArrayList<>(Parser.HTML_BLOCK_TAGS.getFrom(null));
    userTags.add("warp10-warpscript-widget");

    DataHolder options = new MutableDataSet()
      .set(HtmlRenderer.INDENT_SIZE, 2)
      .set(Parser.HTML_BLOCK_DEEP_PARSER,true)
      .set(Parser.HTML_BLOCK_DEEP_PARSE_BLANK_LINE_INTERRUPTS,false)
      .set(Parser.HTML_BLOCK_DEEP_PARSE_FIRST_OPEN_TAG_ON_ONE_LINE,true)
      .set(Parser.HTML_BLOCK_DEEP_PARSE_BLANK_LINE_INTERRUPTS_PARTIAL_TAG,false)
      .set(Parser.HTML_BLOCK_TAGS,userTags)  //TODO : does not work
      .set(TablesExtension.COLUMN_SPANS, false)
      .set(TablesExtension.APPEND_MISSING_COLUMNS, true)
      .set(TablesExtension.DISCARD_EXTRA_COLUMNS, true)
      .set(TablesExtension.HEADER_SEPARATOR_COLUMN_MATCH, true)
      .set(TablesExtension.CLASS_NAME, "table table-striped table-sm")
      .set(WikiLinkExtension.IMAGE_LINKS, true)
      .set(Parser.EXTENSIONS,
        Arrays.asList(
          TablesExtension.create(),
          AttributesExtension.create(),
          TocExtension.create(),
          AutolinkExtension.create(),
          AsideExtension.create()
        )
      );

With these options, if I wrap my code in a <div>, it works. But the custom HTML_BLOCK_TAGS has no effect.

    <div>
    <warp10-warpscript-widget>

    //There should not be no <p> here
    **no md**

    %>
    </warp10-warpscript-widget>
    </div>

I used 0.32.18 for my tests. I can't find out what I did wrong... In my tests, it seems I need the HTML deep parser to get rid of the <p> in the output.

vsch commented 6 years ago

@pi-r-p, you need the HTML deep parser because the CommonMark standard ends an HTML block like this one on a blank line. Deep HTML parsing instead closes the HTML block on the matching closing tag.
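To make the difference between the two rules concrete, here is a toy model in plain Java. It is only a sketch and not flexmark's implementation: the class `HtmlBlockEndSketch` and its methods are hypothetical names, and the tag matching is deliberately simplistic (no attributes, exact tag text only).

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: contrasts "end on blank line" with "end on matching closing tag".
class HtmlBlockEndSketch {

    // Blank-line rule: the block runs until the line before the first blank line.
    // Returns the index of the last line that belongs to the block.
    static int endAtBlankLine(List<String> lines, int start) {
        int i = start;
        while (i + 1 < lines.size() && !lines.get(i + 1).trim().isEmpty()) {
            i++;
        }
        return i;
    }

    // Closing-tag rule: the block runs until the opening tag on the start line
    // has been balanced by a matching closing tag. Blank lines do not interrupt.
    static int endAtClosingTag(List<String> lines, int start, String tag) {
        int depth = 0;
        for (int i = start; i < lines.size(); i++) {
            depth += count(lines.get(i), "<" + tag + ">");
            depth -= count(lines.get(i), "</" + tag + ">");
            if (depth <= 0) {
                return i;
            }
        }
        return lines.size() - 1; // unclosed tag: block runs to the end of input
    }

    // Counts non-overlapping occurrences of needle in text.
    static int count(String text, String needle) {
        int n = 0;
        for (int at = text.indexOf(needle); at >= 0; at = text.indexOf(needle, at + needle.length())) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "<warp10-warpscript-widget>",
            "<%",
            "  $list <% %> FOREACH",
            "",
            "//There should not be no <p> here",
            "",
            "%>",
            "</warp10-warpscript-widget>",
            "",
            "**markdown again**");
        System.out.println("blank-line rule ends at line index " + endAtBlankLine(lines, 0));
        System.out.println("closing-tag rule ends at line index "
            + endAtClosingTag(lines, 0, "warp10-warpscript-widget"));
    }
}
```

Under the blank-line rule, everything after the first blank line falls outside the HTML block and is treated as ordinary Markdown, which is where the unwanted <p> comes from.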

pi-r-p commented 6 years ago

OK, that's close to my guess. And from reading the code, the deep parser does not take the custom HTML_BLOCK_TAGS into account.

Would it be easy for you to add HTML_BLOCK_TAGS as an input to the HTML deep parser? I may be able to do it myself and make a PR, but it would take me a lot of time to dive into your project.

If you do not want to add this feature, just close this issue! If you are open to the idea but have no time to do it, would you accept a PR?

vsch commented 6 years ago

@pi-r-p, PRs are always welcome.

I am adding this little bit and making a release, since I forgot to pass the HTML block tags to the deep parser constructor.

BTW, the block tags should contain all the default tags, since the option is the complete set, not just the additional tags. So you need to create a new set from the defaults, new HashSet<String>(Parser.HTML_BLOCK_TAGS.getFrom(null)), and then add your tags.

You only need to add your tag to the block tags if you want an HTML block to be able to start with your tag. In your second example you start with div, which already creates an HTML block.
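As a rough illustration of why the option has to hold the complete set, here is a plain-Java sketch with no flexmark dependency. The names `BlockTagSetSketch`, `DEFAULT_TAGS`, `completeTagSet` and `startsHtmlBlock` are hypothetical, the default list is a shortened stand-in for whatever Parser.HTML_BLOCK_TAGS.getFrom(null) returns, and the start check is a simplification rather than the library's actual rule.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch only: shows "defaults plus custom tags" versus "custom tags alone".
class BlockTagSetSketch {

    // Hypothetical stand-in for the library defaults; the real list is longer.
    static final List<String> DEFAULT_TAGS = Arrays.asList("div", "p", "pre", "table");

    // Copy the defaults first, then add the custom tags, so the result is complete.
    static Set<String> completeTagSet(List<String> defaults, String... customTags) {
        Set<String> tags = new LinkedHashSet<>(defaults);
        tags.addAll(Arrays.asList(customTags));
        return tags;
    }

    // Toy check: does this line open an HTML block, given the configured tags?
    static boolean startsHtmlBlock(String line, Set<String> tags) {
        String s = line.trim();
        if (!s.startsWith("<") || s.startsWith("</")) {
            return false;
        }
        int end = 1;
        while (end < s.length()
                && (Character.isLetterOrDigit(s.charAt(end)) || s.charAt(end) == '-')) {
            end++;
        }
        return tags.contains(s.substring(1, end).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> tags = completeTagSet(DEFAULT_TAGS, "warp10-warpscript-widget");
        System.out.println(startsHtmlBlock("<warp10-warpscript-widget>", tags));
        System.out.println(startsHtmlBlock("<div>", tags));
    }
}
```

If only the custom tag were passed, a line starting with div would no longer open an HTML block in this model, which is the pitfall the comment above warns about.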

vsch commented 6 years ago

@pi-r-p, repo updated, Maven update in progress.

pi-r-p commented 6 years ago

@vsch Crazy good response time! OK, I will test in a few hours.

pi-r-p commented 6 years ago

It's now working perfectly with the same source code as in my first post. Many thanks for your quick response!