yWorks / svg2pdf.js

A javascript-only SVG to PDF conversion utility that runs in the browser. Brought to you by yWorks - the diagramming experts
MIT License

Performance for multi page export #84

Closed scottctr closed 5 years ago

scottctr commented 5 years ago

Just a question about performance and use of resources. I'm using the code below to export to a multi-page PDF. Works on my machine -- takes about a minute to export a ~300 node graph. Unfortunately, on machines with less RAM and CPU, it takes about 15 minutes and pretty much locks up the machine while it's working. Any suggestions to make this process more efficient and prevent locking up users' machines?

        for (let verticalPageIndex = 0; verticalPageIndex < pagesHighWithMargins; verticalPageIndex++) {
            for (let horizontalPageIndex = 0; horizontalPageIndex < pagesWideWithMargins; horizontalPageIndex++) {
                jsPdf.addPage();

                const svgXOffset = contentWidth * horizontalPageIndex;
                const svgYOffset = contentHeight * verticalPageIndex;

                // use an svg view box to define a single page of the graph
                svgElement.setAttribute('viewBox', this.getViewboxAttribute(svgXOffset, svgYOffset, contentHeight, contentWidth));
                jsPdf.advancedAPI();
                // clip the pdf page so the page fits with the margins
                jsPdf.rect(margin, margin, contentWidth, contentHeight).clip().discardPath();
                svg2pdf(svgElement, jsPdf, {
                    xOffset: margin,
                    yOffset: margin,
                    scale: 1
                });
                jsPdf.compatAPI();

                // add a row and column hint in the bottom margin to help reassemble printed pages into the original graph
                jsPdf.setFontSize(Defaults.PdfRowColumnHintFontSize);
                jsPdf.text('Row ' + (verticalPageIndex + 1) + ' / Column ' + (horizontalPageIndex + 1), contentHeight, contentHeight + 5 + (margin * 2), { baseline: 'bottom' });
            }
        }

HackbrettXXX commented 5 years ago

I can reproduce your performance issues. I tested a graph with ~400 nodes myself and the export took about 10 seconds for a single page on my (fast) machine. Multiplied by the number of pages this is similar to your 1 minute.

The problem is that svg2pdf processes the whole graph for each page again (and each page of the pdf also contains the whole graph - just clipped), which is obviously expensive.

I don't think we can optimize the svg2pdf/jsPDF code itself much further. We optimized its performance some time ago, and while we might gain a few percent there, that won't solve the issue.

One optimization we could do would be omitting elements that are not in the current viewBox/clipPath of the SVG. This would at least reduce the pdf file size and might also save some time during the export.
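To illustrate the idea of omitting elements outside the current viewBox, here is a minimal sketch of rectangle-based culling. It is not how svg2pdf works today; `intersects`, `cullToViewBox`, and the `{ x, y, width, height }` / `bounds` shapes are hypothetical names chosen for this example.

```javascript
// Hypothetical helpers, not part of svg2pdf.js or jsPDF.
// Rectangles are plain objects: { x, y, width, height }.

// True if the two axis-aligned rectangles overlap with non-zero area.
function intersects(a, b) {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  )
}

// Keep only the items whose bounding box overlaps the page's view box.
// Each item is assumed to carry a precomputed `bounds` rectangle.
function cullToViewBox(items, viewBox) {
  return items.filter((item) => intersects(item.bounds, viewBox))
}
```

Items that merely touch the edge of the view box are dropped, since they would contribute nothing visible to the page.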

What you could try is to prune the SVG created by the SvgExport beforehand (using worldBounds), so svg2pdf only runs for the parts of the graph visible on the current page. This is probably faster.

Concerning the locking: Unfortunately you cannot run svg2pdf in a Web Worker, as it requires access to the DOM. You might, however, use timeouts between the single pages, although I'm afraid this won't have a big impact on slow machines, as rendering a single page already takes quite a long time.
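A minimal sketch of the timeout idea, assuming the per-page work is wrapped in a callback. `yieldToEventLoop`, `exportPagesWithPauses`, and `renderPage` are hypothetical names, not part of svg2pdf.js or jsPDF.

```javascript
// Resolves on a later event-loop turn, giving the browser a chance to
// handle input and repaint before the next page is rendered.
function yieldToEventLoop() {
  return new Promise((resolve) => setTimeout(resolve, 0))
}

// renderPage is a hypothetical callback that does the work for one page
// (for example: addPage, set the viewBox, call svg2pdf).
async function exportPagesWithPauses(pageCount, renderPage) {
  for (let pageIndex = 0; pageIndex < pageCount; pageIndex++) {
    await renderPage(pageIndex)
    await yieldToEventLoop()
  }
}
```

This only releases the main thread between pages. While a single page is being rendered the UI is still blocked, so the benefit shrinks as the per-page cost grows.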

Hope this will help you!

scottctr commented 5 years ago

I added setTimeouts in my loop and that reduces the locking up of the machine -- thanks. Now for the pruning with worldBounds. If I understand correctly, to use this technique I'll set the exporter's worldBounds and call exportSvg() inside my loop. Any chance you have a sample of this?

Also, is there a simple way to detect when there isn't any content on a particular page and prevent exporting it? In my current example, I have 86 pages spread across 2 rows. Across all 43 pages of the bottom row, there is only 1 partial port. It would be great to skip exporting the 42 pages with nothing on them.

One last thought I've been considering testing out. Besides the blank pages in my PDF export, I also don't like breaking nodes across pages. I believe that using the multi-page components would prevent both of those situations. Any potential performance improvements using the multi-page components for exporting PDFs?

HackbrettXXX commented 5 years ago

Here is a sample that should get you started:

  // Note: `await` is used below, so this loop has to run inside an async function.
  for (let verticalPageIndex = 0; verticalPageIndex < pagesHighWithMargins; verticalPageIndex++) {
    for (let horizontalPageIndex = 0; horizontalPageIndex < pagesWideWithMargins; horizontalPageIndex++) {
      jsPdf.addPage()

      const svgXOffset = contentWidth * horizontalPageIndex
      const svgYOffset = contentHeight * verticalPageIndex

      // the region of the graph that belongs on the current page
      const targetRect = new Rect(svgXOffset, svgYOffset, contentWidth, contentHeight)

      // export only that region instead of the whole graph
      const exporter = new SvgExport(targetRect, 1)
      exporter.margins = margins
      exporter.inlineSvgImages = true
      const svgElement = await exporter.exportSvgAsync(exportComponent)

      // svgElement now holds just this page's content
      svg2pdf(svgElement, jsPdf, {
        xOffset: margin,
        yOffset: margin,
        scale: 1
      })
    }
  }

Concerning the empty pages: you could check if the SVG produced by the SvgExport for this page is empty and then skip this page.
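One possible way to do that check is sketched below. It is a heuristic under stated assumptions, not an API of svg2pdf.js or yFiles: `hasDrawableContent` and the selector list are hypothetical, and the structure of the SVG that SvgExport actually produces should be inspected before relying on it.

```javascript
// Hypothetical helper. Tag names that actually paint something;
// containers such as <g> or <svg> do not count on their own.
const DRAWABLE_SELECTOR =
  'path, rect, circle, ellipse, line, polyline, polygon, text, image, use'

// Returns true if the SVG contains at least one drawable element that is
// not merely a definition (inside <defs>, <clipPath>, or <mask>).
function hasDrawableContent(svgElement) {
  const candidates = svgElement.querySelectorAll(DRAWABLE_SELECTOR)
  return Array.from(candidates).some(
    (element) => element.closest('defs, clipPath, mask') === null
  )
}
```

In the page loop, the check would run right after the SVG for a page has been exported, with `jsPdf.addPage()` moved after it so that a skipped page does not leave a blank page in the PDF.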

Regarding your last question: I will forward it to the yFiles support team, as this is a yFiles for HTML related question and doesn't really belong in this channel.

scottctr commented 5 years ago

Replacing the jsPdf..discardPath with the SvgExporter for each page took my time from almost a minute to less than 3 seconds.

I was also able to identify the svgElements without any graph content and exclude them from the PDF.

Many thanks!!

HackbrettXXX commented 5 years ago

Great to hear!