kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Mermaid rendering on RTD is brittle #3395

Closed: astrojuanlu closed 10 months ago

astrojuanlu commented 10 months ago

Recently I've been spotting some intermittent failures caused by the Mermaid rendering:

WARNING: mermaid code 'flowchart TD\n  A[Start] --> B{Do you prefer developing your projects in notebooks?}\n  B -->|Yes| C[Use a Databricks workspace to develop a Kedro project]\n  B -->|No| D{Are you a beginner with Kedro?}\n  D -->|Yes| E[Use an IDE, dbx and Databricks Repos to develop a Kedro project]\n  D -->|No| F{Do you have advanced project requirements<br>e.g. CI/CD, scheduling, production-ready, complex pipelines, etc.?}\n  F -->|Yes| G{Is rapid development needed for your project needs?}\n  F -->|No| H[Use an IDE, dbx and Databricks Repos to develop a Kedro project]\n  G -->|Yes| I[Use an IDE, dbx and Databricks Repos to develop a Kedro project]\n  G -->|No| J[Use a Databricks job to deploy a Kedro project]': Mermaid exited with error:
[stderr]
b'\nTimeoutError: Timed out after 30000 ms while waiting for the WS endpoint URL to appear in stdout!\n    at ChromeLauncher.launch (file:///home/docs/.asdf/installs/nodejs/19.0.1/lib/node_modules/@mermaid-js/mermaid-cli/node_modules/puppeteer-core/lib/esm/puppeteer/node/ProductLauncher.js:119:23)\n    at async run (file:///home/docs/.asdf/installs/nodejs/19.0.1/lib/node_modules/@mermaid-js/mermaid-cli/src/index.js:404:19)\n    at async cli (file:///home/docs/.asdf/installs/nodejs/19.0.1/lib/node_modules/@mermaid-js/mermaid-cli/src/index.js:184:3)\n\n'
[stdout]
b''

https://readthedocs.org/projects/kedro/builds/22757197/

astrojuanlu commented 10 months ago

Looks like the puppeteer version pinned by mermaid-cli is quite old https://github.com/mermaid-js/mermaid-cli/issues/627

astrojuanlu commented 10 months ago

Upstream issue https://github.com/puppeteer/puppeteer/issues/10556

Ways we can try to fix this:

We'd need a package.json for this https://github.com/mermaid-js/mermaid-cli/issues/627#issuecomment-1853769076

astrojuanlu commented 10 months ago

Or commit a pre-rendered PNG to the source tree and remind ourselves to re-render it when the source changes (which doesn't happen that often??)
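
The "remind ourselves" part could be automated with a small staleness check: hash the Mermaid source and compare it with a digest recorded when the PNG was last rendered. This is only a minimal sketch. The file layout (a `.mmd` source file plus a sidecar digest file) and all function names are hypothetical, not something that exists in the Kedro repository.

```python
import hashlib
from pathlib import Path


def source_digest(mermaid_source: str) -> str:
    """Return a SHA-256 hex digest of the Mermaid source text."""
    return hashlib.sha256(mermaid_source.encode("utf-8")).hexdigest()


def is_stale(source_path: Path, digest_path: Path) -> bool:
    """Return True if the committed PNG should be re-rendered.

    The PNG counts as stale when no digest has been recorded yet, or when
    the recorded digest no longer matches the current source.
    """
    if not digest_path.exists():
        return True
    recorded = digest_path.read_text(encoding="utf-8").strip()
    current = source_digest(source_path.read_text(encoding="utf-8"))
    return recorded != current


def record_digest(source_path: Path, digest_path: Path) -> None:
    """Store the digest of the current source after a successful render."""
    digest = source_digest(source_path.read_text(encoding="utf-8"))
    digest_path.write_text(digest + "\n", encoding="utf-8")
```

A docs build or pre-commit step could call `is_stale(...)` and fail with a message asking for a re-render, so the reminder does not depend on anyone's memory.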

stichbury commented 10 months ago

I like that option. TBH, I have never been a big fan of using Mermaid to build dynamic graphics when a PNG will do.

stichbury commented 10 months ago

I want to make some changes to the docs that use the mermaid graphic anyway so will factor in a change to the graphics at the same time.

astrojuanlu commented 10 months ago

sphinxcontrib-mermaid doesn't seem to support static pre-rendering https://github.com/mgaitan/sphinxcontrib-mermaid/issues/134

stichbury commented 10 months ago

I've been looking at tools for this like https://discourse.joplinapp.org/t/ability-to-save-export-mermaid-graphs-as-image/23491

astrojuanlu commented 10 months ago

Yeah, we can actually take the output from build/html and paste it in our images directory. There are a number of workarounds, but it looks like we'd lose the convenience of having the Mermaid code there.

stichbury commented 10 months ago

We can put it in a comment though, right?

astrojuanlu commented 10 months ago

That's where my mind was right now...

Essentially we can call mermaid-cli ourselves once with these parameters, store the PNG under version control, and have the Mermaid code as a comment next to the diagram itself.

https://github.com/kedro-org/kedro/blob/543fa8b359160591243eb363328e25a517d9a9bc/docs/source/conf.py#L554

I'll send a PR with this.
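
For reference, the one-off rendering step described above might look roughly like the following. It is a sketch under assumptions: it assumes `mmdc` (the mermaid-cli executable) is on `PATH`, uses its `-i`/`-o` input and output flags, and leaves any extra parameters to the caller rather than reproducing the ones in `conf.py`. The function names and the timeout value are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_mmdc_command(
    source: Path, output: Path, extra_params: Sequence[str] = ()
) -> List[str]:
    """Build the mermaid-cli command line without running it."""
    return ["mmdc", "-i", str(source), "-o", str(output), *map(str, extra_params)]


def render_png(source: Path, output: Path, extra_params: Sequence[str] = ()) -> None:
    """Render a Mermaid source file to an image by calling mermaid-cli once.

    Raises subprocess.CalledProcessError if mmdc exits with an error, and
    subprocess.TimeoutExpired if it hangs (the timeout is an arbitrary choice).
    """
    command = build_mmdc_command(source, output, extra_params)
    subprocess.run(command, check=True, timeout=120)
```

Usage would be something like `render_png(Path("diagram.mmd"), Path("diagram.png"), ["-p", "puppeteer-config.json"])`, where the file names are placeholders. The resulting PNG is then committed, and the Mermaid source stays in a comment next to the image reference.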