vsch / flexmark-java

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
BSD 2-Clause "Simplified" License

The current flexmark parser is slower than the parser from version 0.50.50 #538

Open ahmad2702 opened 1 year ago

ahmad2702 commented 1 year ago

In my project I generate DOCX documents from long Markdown files. With the current flexmark release (0.64.0), the parser needs ~40-45 seconds to parse the content of the Markdown file. With flexmark 0.50.50 it takes ~10-15 seconds. The old version is therefore roughly 3 times faster than the current one.

To Reproduce

The attached md-flexmark.md can be used with the following snippet:

// Project-specific helper: returns a package built from an empty local document.
WordprocessingMLPackage doc = getMyMLPackageFromEmtpyLocalDocument();

// Project-specific helper: returns the content of md-flexmark.md, repeated to make the input long.
String rawMarkdown = getMyMarkdownContentFromIssue().repeat(1000);

// Typed as Extension so the list matches what Parser.EXTENSIONS expects.
List<Extension> extensions = Arrays.asList(
        DefinitionExtension.create(), EmojiExtension.create(), FootnoteExtension.create(),
        StrikethroughSubscriptExtension.create(), InsExtension.create(), TablesExtension.create(),
        TocExtension.create(), SimTocExtension.create(), WikiLinkExtension.create());

MutableDataSet options = new MutableDataSet();
options.set(Parser.EXTENSIONS, extensions);
options.set(DocxRenderer.SUPPRESS_HTML, true);

Parser parser = Parser.builder(options).build();
DocxRenderer renderer = DocxRenderer.builder(options).build();

// Parse the markdown into an AST, then render the AST into the DOCX package.
Node document = parser.parse(rawMarkdown);
renderer.render(document, doc);

// Write the result to the desktop.
Path path = Paths.get(System.getProperty("user.home"), "Desktop", "DocumentWithContentFromMarkdown.docx");
doc.save(path.toFile());
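
The snippet above parses, renders and saves in one run, so timing each stage separately may help confirm how much of the total is spent in the parser. Below is a minimal sketch that uses only the JDK. `StageTimer`, `timeCall`, `timeRun`, `nanos` and `stages` are hypothetical names chosen for illustration and are not part of flexmark. The workloads in `main` are stand-ins, and the comments show where the real calls would go.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper (not part of flexmark): accumulates wall-clock time per named stage.
public class StageTimer {
    // LinkedHashMap keeps the stages in the order they were first timed.
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    // Runs the work, adds the elapsed nanoseconds to the stage, and returns the result.
    public <T> T timeCall(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    // Same as timeCall, for stages that return nothing.
    public void timeRun(String stage, Runnable work) {
        timeCall(stage, () -> {
            work.run();
            return null;
        });
    }

    // Total nanoseconds recorded for a stage, or 0 if it was never timed.
    public long nanos(String stage) {
        return nanosByStage.getOrDefault(stage, 0L);
    }

    // Read-only view of all recorded stages.
    public Map<String, Long> stages() {
        return Collections.unmodifiableMap(nanosByStage);
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();

        // Stand-in for loading the markdown; in the snippet above this is rawMarkdown.
        String markdown = timer.timeCall("load", () -> "# Heading\n\nSome *text*.\n".repeat(1000));

        // Stand-in for: Node document = timer.timeCall("parse", () -> parser.parse(rawMarkdown));
        String[] lines = timer.timeCall("parse", () -> markdown.split("\n"));

        // Stand-in for: timer.timeRun("render", () -> renderer.render(document, doc));
        timer.timeRun("render", () -> String.join("\n", lines));

        timer.stages().forEach((stage, nanos) ->
                System.out.printf("%-6s %d ms%n", stage, nanos / 1_000_000));
    }
}
```

Running the same harness against 0.50.50 and 0.64.0 would show whether the difference sits in the parse stage or in rendering. A single run includes JIT warm-up, so repeating the measurement a few times gives steadier numbers.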

Time analysis

With VisualVM it was possible to get more information about where in the code most of the time is spent: visual-vm-flexmark

Expected behavior: The parser should run as fast as it did before (i.e. in 0.50.50).