yWorks / svg2pdf.js

A JavaScript-only SVG to PDF conversion utility that runs in the browser. Brought to you by yWorks, the diagramming experts.
MIT License

How to ignore invalid dataurl #262

Closed tangenttechno closed 11 months ago

tangenttechno commented 11 months ago

I am using Apache PDFBox to generate SVGs from PDFs, but the generated SVGs can contain invalid PNG data URLs. When I try to export to PDF using doc.svg(), I get the error 'addImage does not support files of type 'UNKNOWN', please ensure that a plugin for 'UNKNOWN' support is added'. How can I ignore this error when a data URL is not valid?

yGuy commented 11 months ago

I don't think there is a way. It's hard enough already to render valid SVGs properly, and validating and fixing invalid SVGs is not a goal of this library. So I propose a preprocessing step that removes or fixes the broken images, whichever is appropriate, before passing the SVG to this library. Maybe you can fix Apache PDFBox instead?

tangenttechno commented 11 months ago

@yGuy Thank you for the quick reply. One thing I noticed is that the library is trying to fetch an invalid data URL. Please see the attached SVG (sample) for the original. Below is the URL the library is trying to fetch:

%0AMBBFsyh4A8Vd915Aj2oPUc9U7Dk6/bMohGFak1GZQPPgI2rmTRA1CaFSqZwAUbgg%0AVz7Ke6eChiNyQ54IReFzvj7KmsOAvEPuovG38LhOOnYB4YA8lGa/wuMH6TIBUWuY%0AQDyRVjqzgWRR5DlZpDMLCCZFaskk3cmgeFaElszSnQyKV0VoySrdSaCwQV6K0BL2%0ANLLHJijqFdme9LLHJlTCk2DI+51gqJCvw/8/wZD3H5OhEtYOhrxX0Q/kvZ+IIc+d%0AlQZ57TErlb/hDbwZp8G79kS6AAAAAElFTkSuQmCC;base64,iVBORw0KGgoAAAANSUhEUgAAACEAAAAhCAYAAABX5MJvAAAA1UlEQVR4Xu2WQQqDMBBFsyh4A8Vd915Aj2oPUc9U7Dk6/bMohGFak1GZQPPgI2rmTRA1CaFSqZwAUbggVz7Ke6eChiNyQ54IReFzvj7KmsOAvEPuovG38LhOOnYB4YA8lGa/wuMH6TIBUWuYQDyRVjqzgWRR5DlZpDMLCCZFaskk3cmgeFaElszSnQyKV0VoySrdSaCwQV6K0BL2NLLHJijqFdme9LLHJlTCk2DI+51gqJCvw/8/wZD3H5OhEtYOhrxX0Q/kvZ+IIc+dlQZ57TErlb/hDbwZp8G79kS6AAAAAElFTkSuQmCC

[Two screenshots attached: 2023-07-21 at 3:59:54 PM and 2023-07-21 at 4:00:31 PM]

All of the invalid URLs contain 'data:image' twice, which is not the case in the original. Is this related to the library? All I am doing is downloading the SVG and exporting it to PDF.


HackbrettXXX commented 11 months ago

The "data:image" is added here: https://github.com/yWorks/svg2pdf.js/blob/eafbd3d52349c9ba341213cfab60ebb368f7d020/src/nodes/image.ts#L77

And I suppose this method fails when trying to parse the broken data URL: https://github.com/yWorks/svg2pdf.js/blob/eafbd3d52349c9ba341213cfab60ebb368f7d020/src/nodes/image.ts#L107-L135

However, I agree with @yGuy. You should pass valid URLs to svg2pdf. Svg2pdf cannot detect or fix broken URLs.

tangenttechno commented 11 months ago

@HackbrettXXX Thank you for the reply. If we inspect the SVG file I attached, we cannot find the PNG data URL given above, and I am not sure where it came from. If you run the simple sample code I provided, you will get the same error. Can you try?

yGuy commented 11 months ago

I checked the SVG file you sent, and the data URL is likely only problematic because of the newlines in the attribute value. The XML parser keeps them as whitespace in the value, and the regex no longer matches. It seems browsers are pretty lenient here, and base64 strings should simply be stripped of any whitespace before being interpreted.

We could, and maybe should, add this functionality to svg2pdf. But of course the workaround is to preprocess the SVG: parse it into a DOM, select all image hrefs, and strip the whitespace from the data URLs. That should make the library happy.
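The workaround described above could be sketched roughly as follows. This is only a minimal illustration, not code from svg2pdf: the names `HrefCarrier`, `stripDataUrlWhitespace`, and `cleanImageHrefs` are hypothetical, and the sketch assumes that only base64 data URLs should be touched, since whitespace can be meaningful in other kinds of URLs.

```typescript
// Hypothetical minimal shape of an element carrying an href attribute.
// SVG <image> elements from a browser DOM satisfy it structurally.
interface HrefCarrier {
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

// Strip all whitespace from a base64 data URL; return any other URL unchanged.
function stripDataUrlWhitespace(href: string): string {
  const comma = href.indexOf(",");
  if (comma === -1) {
    return href; // no payload separator, so this is not a data URL we can clean
  }
  // Whitespace may also have ended up in the header part, so clean it first.
  const header = href.slice(0, comma).replace(/\s+/g, "");
  if (!/^data:.*;base64$/i.test(header)) {
    return href; // not base64-encoded: leave the value as it is
  }
  const payload = href.slice(comma + 1).replace(/\s+/g, "");
  return `${header},${payload}`;
}

// Clean the href and xlink:href attributes of the given elements in place.
// Returns how many attribute values were changed.
function cleanImageHrefs(images: ArrayLike<HrefCarrier>): number {
  let changed = 0;
  for (let i = 0; i < images.length; i++) {
    for (const name of ["href", "xlink:href"]) {
      const value = images[i].getAttribute(name);
      if (value === null) {
        continue;
      }
      const cleaned = stripDataUrlWhitespace(value);
      if (cleaned !== value) {
        images[i].setAttribute(name, cleaned);
        changed++;
      }
    }
  }
  return changed;
}
```

In the browser, something like `cleanImageHrefs(svgElement.querySelectorAll("image"))` could be run on the parsed SVG element before it is passed to doc.svg(), where `svgElement` stands for whatever element you already pass in.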

tangenttechno commented 11 months ago

@yGuy Thank you for the reply. Yes, your solution worked. I minified the SVG and now it's rendering fine. Thank you.