mbattyani / cl-pdf

CL-PDF is a cross-platform Common Lisp library for generating PDF files.

Non-ASCII characters in metadata #24

Open phmarek opened 3 years ago

phmarek commented 3 years ago

This doesn't work:

  (with-document (:author "Föö")
    ...)

The umlauts are written to the PDF as UTF-8; the PDF Reference (section 3.8.1) requires text strings to be encoded either in PDFDocEncoding or in UTF-16BE with a leading byte-order mark.
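
For illustration, here is a minimal sketch of what the expected encoding looks like: the byte-order mark FE FF followed by the string in UTF-16BE, written as a PDF hexadecimal string. The names utf16be-octets and pdf-hex-text-string are hypothetical and not part of cl-pdf, and this is not necessarily how cl-pdf serializes strings internally. It assumes a Lisp whose char-code returns Unicode code points.

```lisp
;; Hypothetical helpers for illustration only; not part of cl-pdf.
;; Assumes CHAR-CODE returns Unicode code points.

(defun utf16be-octets (string)
  "Return a list of octets: the BOM FE FF followed by STRING in UTF-16BE."
  (let ((octets (list #xFF #xFE)))      ; built in reverse, so the BOM ends up first
    (flet ((push-unit (unit)
             ;; Push one 16-bit code unit, high byte first (big-endian).
             (push (ldb (byte 8 8) unit) octets)
             (push (ldb (byte 8 0) unit) octets)))
      (loop for char across string
            for code = (char-code char)
            do (if (< code #x10000)
                   (push-unit code)
                   ;; Code points beyond the BMP need a surrogate pair.
                   (let ((offset (- code #x10000)))
                     (push-unit (+ #xD800 (ash offset -10)))
                     (push-unit (+ #xDC00 (logand offset #x3FF)))))))
    (nreverse octets)))

(defun pdf-hex-text-string (string)
  "Format STRING as a PDF hexadecimal string literal holding UTF-16BE octets."
  (format nil "<~{~2,'0X~}>" (utf16be-octets string)))
```

For the author value above ("F" followed by two "ö", code point #xF6), utf16be-octets yields the octets FE FF 00 46 00 F6 00 F6.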

erikronstrom commented 3 years ago

It seems like this could easily be fixed by calling pdf-string on the values in add-doc-info:

(defun add-doc-info (doc &key (creator "") author title subject keywords)
  (setf (docinfo doc) (make-instance 'indirect-object))
  [...]
                      ,@(when author `(("/Author" . ,(pdf-string author))))
                      ,@(when title `(("/Title" . ,(pdf-string title))))
                      ,@(when subject `(("/Subject" . ,(pdf-string subject))))
                      ,@(when keywords `(("/Keywords" . ,(pdf-string keywords))))
  [...]

However, I'm a little suspicious about the implementation of pdf-string. It sets the unicode flag if the string contains characters with a char-code above 255, but what about the non-ASCII characters between 128 and 255? If a PDF string holds a single-byte encoded string, it is supposed to be in PDFDocEncoding, which is not the same as whatever encoding code-char uses.
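
One conservative way around the 128 to 255 question, sketched here as an assumption rather than as what pdf-string does today, would be to use the single-byte form only for printable ASCII (where PDFDocEncoding and ASCII agree) and to switch to UTF-16BE for everything else. The predicate name needs-utf16be-p is hypothetical and not part of cl-pdf.

```lisp
;; Hypothetical predicate for illustration only; not part of cl-pdf.
;; Treats only printable ASCII (codes 32-126) as safe for the single-byte form,
;; so no PDFDocEncoding translation table is needed.

(defun needs-utf16be-p (string)
  "True if STRING contains any character outside printable ASCII (codes 32-126)."
  (flet ((printable-ascii-p (char)
           (<= 32 (char-code char) 126)))
    (notevery #'printable-ascii-p string)))
```

With this check, a string containing "ö" (code 246) would already be routed to the UTF-16BE path, instead of only strings with codes above 255. The trade-off is that control characters such as newlines also trigger the UTF-16BE form, which is more verbose but stays within what the specification allows.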

I could make a pull request for this, but I'm not sure I have fully understood the purpose of pdf-string, and I don't want to unintentionally break other stuff...