mermaid-js / mermaid-cli

Command line tool for the Mermaid library
MIT License

improve performance of markdown file processing #694

Closed: meizy closed this issue 1 month ago

meizy commented 1 month ago

When you have a lot of markdown files to process with mmdc (using xargs or similar), and only some of them contain mermaid diagrams, a relatively large share of the time goes to processing the .md files that do not contain any diagram.

Regex processing is relatively heavy. It might improve performance if, before running the regex that extracts the diagrams from the file, the script did a simple check for whether the file contains the string ```mermaid at all.

It seems like a small change in index.js, but I'm not proficient enough in JS to write the code.

For your consideration.
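
The check proposed in this comment could be sketched roughly as follows. This is only an illustration: `extractMermaidBlocks` and its regex are hypothetical stand-ins, not the code in mermaid-cli's index.js, and a real check would have to cover every fence syntax the tool accepts.

```javascript
// Hypothetical sketch: skip the regex pass when a file cannot contain a diagram.
// The fence marker is built with repeat() to avoid a literal triple backtick here.
const FENCE = '`'.repeat(3) + 'mermaid';

// Illustrative regex only; the actual pattern used by mermaid-cli may differ.
const MERMAID_BLOCK = /^`{3}mermaid[^\n]*\n([\s\S]*?)^`{3}[ \t]*$/gm;

function extractMermaidBlocks(markdown) {
  // Cheap substring scan first: no fence marker means no diagrams.
  if (!markdown.includes(FENCE)) {
    return [];
  }
  // Only files that pass the scan pay for the regex.
  return Array.from(markdown.matchAll(MERMAID_BLOCK), (m) => m[1]);
}
```

Whether this helps noticeably depends on where the time is actually spent, so it would be worth measuring before and after.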

LeonKuhne commented 1 month ago

yeah that ^^^ and also a switch from puppeteer to playwright would allow processing multiple images (pages) at once -- or just skip that step and use a custom renderer (or any lib that doesn't require me to install chromium pleeeease!)

aloisklink commented 1 month ago

> When you have a lot of markdown files to process with mmdc (using xargs or similar), and only some of them contain mermaid diagrams, a relatively large share of the time goes to processing the .md files that do not contain any diagram.

Good idea! I think the slowest part is actually launching Puppeteer, though. I've made a PR to lazy-load it so that it only gets loaded if needed: https://github.com/mermaid-js/mermaid-cli/pull/696
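
For readers unfamiliar with the pattern, lazy-loading here means deferring the Puppeteer import until a file actually has something to render. The sketch below shows the general idea only; it is not the code from the PR. `lazy`, `getPuppeteer`, `processFile`, and the injected `render` callback are hypothetical names.

```javascript
// Hypothetical helper: wrap a loader so it runs at most once, on first use.
function lazy(load) {
  let cached;
  return () => {
    if (cached === undefined) {
      cached = load(); // keep the result so later callers reuse it
    }
    return cached;
  };
}

// Nothing is imported at this point; Puppeteer is loaded only when
// getPuppeteer() is first called.
const getPuppeteer = lazy(() => import('puppeteer'));

// Stand-in for the per-file work; `render` is injected for illustration.
async function processFile(diagrams, render) {
  if (diagrams.length === 0) {
    return []; // no diagrams: Puppeteer is never loaded or launched
  }
  const { default: puppeteer } = await getPuppeteer();
  const browser = await puppeteer.launch();
  try {
    return await Promise.all(diagrams.map((d) => render(browser, d)));
  } finally {
    await browser.close();
  }
}
```

With this shape, a markdown file without diagrams returns before Puppeteer is imported or a browser is started.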


> switch from puppeteer to playwright would allow processing multiple images (pages) at once

mermaid-cli and Puppeteer already support that! For example, if you have one markdown file with multiple mermaid diagrams in it, it will render them in parallel using a single browser instance.

However, it still needs to create a browser instance per .md file, which is slow if you have lots of .md files.

You can use remark-mermaid-dataurl if you want to process multiple .md files. It's much, much faster, since it only uses a single browser instance.

There's also remark-mermaidjs, which is similar, but uses Playwright instead of Puppeteer.
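
To illustrate why a single browser instance helps, here is a minimal sketch of sharing one browser across many files while still rendering each file's diagrams in parallel. It does not show the API of mermaid-cli or of either remark plugin: `renderAllFiles`, `launchBrowser`, and `renderDiagram` are hypothetical, injected stand-ins.

```javascript
// Hypothetical sketch: launch the browser once and reuse it for every file.
// `files` is an iterable of [fileName, diagramSources] pairs.
async function renderAllFiles(files, launchBrowser, renderDiagram) {
  const browser = await launchBrowser(); // startup cost paid once, not per file
  try {
    const results = new Map();
    for (const [name, diagrams] of files) {
      // Diagrams within one file are rendered in parallel on the shared browser.
      const rendered = await Promise.all(
        diagrams.map((source) => renderDiagram(browser, source))
      );
      results.set(name, rendered);
    }
    return results;
  } finally {
    await browser.close();
  }
}
```

In a real setup `launchBrowser` might be something like `() => puppeteer.launch()`, with the render step supplied by whichever tool does the actual drawing.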

> or just skip that step and use a custom renderer (or any lib that doesn't require me to install chromium pleeeease!)

Unfortunately, Mermaid needs a CSS layout engine to render properly, and as far as I'm aware, only browsers support this. See https://github.com/mermaid-js/mermaid/issues/3650.

Although, maybe https://github.com/servo/servo will help in the far future :shrug:

You can use Puppeteer with Firefox, though: https://pptr.dev/faq#q-what-is-the-status-of-cross-browser-support