modelmat / sphinxcontrib-drawio

Sphinx extension for including draw.io files.
MIT License

Doesn't seem to convert in parallel #67

Open jdillard opened 2 years ago

jdillard commented 2 years ago

The extension seems to convert drawio files one at a time, even when Sphinx is run in parallel mode. I thought adding "parallel_write_safe": True might help, but it doesn't seem to have any effect. Any ideas on how to convert drawio files in parallel? On Sphinx projects with a large number of drawio diagrams, this can become quite a performance bottleneck.

modelmat commented 2 years ago

You will have to run sphinx with https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-j.

This defaults to 1.
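
For reference, a minimal sketch of what passing that option looks like. `-j auto` asks Sphinx to pick the worker count from the number of CPUs. The function name `build_docs_parallel` and the `SPHINX_BUILD` override are hypothetical names used only for this example, and the source and output paths are placeholders.

```shell
#!/bin/bash

# Build HTML docs with Sphinx's parallel mode enabled.
# SPHINX_BUILD is a hypothetical override for the executable path,
# included so the sketch can be pointed at a specific install.
build_docs_parallel() {
  src_dir="$1"
  out_dir="$2"
  sphinx_build="${SPHINX_BUILD:-sphinx-build}"

  # -j auto: let Sphinx choose the number of parallel workers
  # -b html: use the HTML builder
  "$sphinx_build" -j auto -b html "$src_dir" "$out_dir"
}
```

Calling `build_docs_parallel ./source ./build/html` would then run the build with parallel workers. Whether the drawio conversions themselves are parallelized by this is a separate question.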

I haven't ever seen this really used, and haven't ever tested it myself, so I don't know how well it will go, or whether issues will come up from multiple instances at once.

I will note that if build times are an issue, committing editable PNGs or SVGs (mentioned in this project's README) might be a better option for you, as it removes the need for this extension and the conversion process entirely. I believe this extension should also be caching the built images...

jdillard commented 2 years ago

The more I look into this, the more I realize I don't know how to test for success. We are running Sphinx in parallel, but I can't say for sure whether the drawio portion is, so take this with a grain of salt for now. The cache is great, but it only helps on incremental builds (since the cache has to be built the first time around). I have thought about adding an option to host the cached files somewhere and pull them down, though I'm not sure about the feasibility yet.

I'll keep digging and look into the editable PNGs or SVGs as well and post any relevant updates, for clarity's sake. Thanks!

Edit: I think the limiting factor is that drawio is spun up separately for each conversion (one file at a time), rather than doing a single bulk conversion of a whole folder. Bulk conversion has its own limitations, though, including only being able to pass one set of options for the whole folder and not being able to convert specific pages of a single file.

niziak commented 2 months ago

You will have to run sphinx with https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-j.

This defaults to 1.

This speeds up doc generation a lot, but drawio files are still converted one by one (which is slow).

jdillard commented 2 months ago

I've been exploring a couple of alternatives recently. One is using the drawio batch (folder) convert feature, which saves a couple of minutes (~25% time savings) with 150+ drawio files to convert. It seems like batch conversion itself could perform better, but that is a drawio-desktop issue. If you want to compare for yourself, you can use something like this bash script to convert the files in a source directory one by one, and compare that against converting the source directory in batch:

#!/bin/bash

# Directory containing the files
DIRECTORY="./source"

# Loop through each file in the directory
for FILE in "$DIRECTORY"/*
do
  # Call the drawio executable with the current file
  /opt/drawio/drawio --export --crop --page-index 0 --scale 1.0 --transparent --format png --output ./build "$FILE" --enable-logging --no-sandbox
done
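
For the batch side of that comparison, a rough sketch might look like the following. It assumes drawio accepts a folder as the input argument, as the batch (folder) convert feature mentioned above implies, and it drops --page-index because specific pages can't be selected in folder mode. The function name `convert_batch` and the `DRAWIO_BIN` override are hypothetical, used only for this example.

```shell
#!/bin/bash

# Run drawio once against a whole folder instead of once per file.
# DRAWIO_BIN is a hypothetical override; the default is the path
# used in the one-by-one script above.
convert_batch() {
  src_dir="$1"
  out_dir="$2"
  drawio_bin="${DRAWIO_BIN:-/opt/drawio/drawio}"

  # Make sure the output folder exists before drawio writes into it
  mkdir -p "$out_dir"

  # One drawio invocation; the same options apply to every file in the folder
  "$drawio_bin" --export --crop --scale 1.0 --transparent --format png \
    --output "$out_dir" "$src_dir" --enable-logging --no-sandbox
}
```

Running `time convert_batch ./source ./build` and timing the one-by-one script the same way should give comparable numbers, ideally starting from an empty ./build each time.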

I've also been experimenting with adding all the drawio command calls to a list and running them in parallel at the end of the build step, to see whether that outperforms the batch conversion, but that implementation is a little trickier.
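
Outside of Sphinx, the "run the calls in parallel" idea can be approximated in the shell with find and xargs -P. This is only a sketch for comparing timings, not how the extension works: the function name `convert_parallel`, the `DRAWIO_BIN` override, and the restriction to *.drawio files are assumptions made for the example, and I haven't verified how well several drawio instances behave when run at once.

```shell
#!/bin/bash

# Convert every .drawio file under a source directory, running up to
# a given number of drawio processes at the same time.
# DRAWIO_BIN is a hypothetical override for the executable path.
convert_parallel() {
  src_dir="$1"
  out_dir="$2"
  jobs="${3:-4}"   # number of concurrent drawio processes (default 4)
  drawio_bin="${DRAWIO_BIN:-/opt/drawio/drawio}"

  mkdir -p "$out_dir"

  # find emits NUL-separated paths so file names with spaces survive;
  # xargs -P runs up to $jobs commands concurrently, one file per command.
  find "$src_dir" -type f -name '*.drawio' -print0 |
    xargs -0 -P "$jobs" -I{} \
      "$drawio_bin" --export --crop --page-index 0 --scale 1.0 \
        --transparent --format png --output "$out_dir" {} \
        --enable-logging --no-sandbox
}
```

For example, `time convert_parallel ./source ./build 8` would run up to eight conversions at once, which can be compared against the one-by-one loop and the batch conversion.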