ArtifexSoftware / Ghostscript.NET

Ghostscript.NET - managed wrapper around the Ghostscript library (32-bit & 64-bit)
https://ghostscript.com
GNU Affero General Public License v3.0

Ghostscript.Net fails to open a PDF file whose name includes strange whitespace character #64

Closed KeithVinson closed 3 years ago

KeithVinson commented 5 years ago

Hello all, I found a strange case. I was using a tool to split a large PDF into individual sections by its first-level bookmarks. It turns out the bookmark text contained some unusual whitespace characters, which cause Ghostscript.Net to fail when it tries to load the resulting PDF.

Yes, I know technically this is not on you, but maybe you should consider cleaning up the strange whitespace characters as you are opening a PDF file.

The tool I was using splits a PDF into sections by bookmark and creates file names of the form [FILENUMBER]-[BOOKMARK_NAME].

The bookmark name extracted from the PDF at runtime contained a 0xA0 character in place of 0x20 in a few places. I had a devil of a time figuring this out: every tool I used to examine the file name displayed the 0xA0 as an ordinary single space, so the name looked "correct". Only when I replaced the 0xA0 with 0x20 did Ghostscript.Net open the file successfully.
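For anyone chasing a similar problem, a small script can make such invisible characters visible. The following is only a minimal sketch in Python, not part of Ghostscript.Net; the function name `reveal_unusual_chars` and the sample file name are made up for illustration.

```python
import unicodedata

def reveal_unusual_chars(name: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, Unicode name) for every character in
    `name` that is a control character or lies outside printable ASCII."""
    found = []
    for i, ch in enumerate(name):
        if ord(ch) < 0x20 or ord(ch) > 0x7E:
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return found

# Hypothetical file name containing a non-breaking space (U+00A0).
sample = "03-Section\u00a0One.pdf"
print(reveal_unusual_chars(sample))
# prints [(10, 'U+00A0', 'NO-BREAK SPACE')]
```

An empty list means the name contains only printable ASCII, so the problem lies elsewhere.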

Turns out 0xA0 is a non-breaking space (who knew there was such a thing; strictly speaking it is U+00A0 in Latin-1 and Unicode, not ASCII) https://en.wikipedia.org/wiki/Non-breaking_space. So from a typesetting perspective, flagging the section name to be rendered as an unbroken line makes perfect sense. But allowing that nuance to cause Ghostscript.Net to fail to open the file might be worth looking into.
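As a caller-side workaround, the file name can be normalized before it is handed to the library. This is a sketch under my own assumptions, written in Python for brevity; it does not describe what Ghostscript.Net itself does, and `normalize_spaces` is a hypothetical helper name.

```python
import unicodedata

def normalize_spaces(name: str) -> str:
    """Replace every Unicode space separator (general category Zs),
    including the non-breaking space U+00A0, with a plain U+0020."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in name)

# Hypothetical file name produced by a bookmark-splitting tool.
original = "03-Section\u00a0One.pdf"
cleaned = normalize_spaces(original)
print(cleaned == "03-Section One.pdf")  # prints True
```

The file on disk would still need to be renamed or copied to the cleaned name before opening it; the function only computes the new name.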

Cheers...

ygl365166495 commented 4 years ago

Maybe the PDF filename needs to be enclosed in double quotes.

jhabjan commented 3 years ago

Fixed in today's v1.2.2 release.