GreyWyvern commented 5 months ago

When a document doesn't include a BaseEncoding entry, StandardEncoding should be assumed as the default instead of an empty string.

Type of pull request

[X] Bug fix (involves code and configuration changes)

About

Some short-and-sweet documents may not include a BaseEncoding entry. For this case, the PDF Reference 1.7 describes StandardEncoding as the default.

Chapter 5, page 426:

Latin-text font programs produced by Adobe Systems use the Adobe standard encoding, often referred to as StandardEncoding. The name StandardEncoding has no special meaning in PDF, but this encoding does play a role as a default encoding.

Section 5.5, page 431:

If the Encoding entry is a dictionary, the table is initialized with the entries from the dictionary's BaseEncoding entry (see Table 5.11). Any entries in the Differences array are used to update the table. Finally, any undefined entries in the table are filled using StandardEncoding.
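The three steps in that passage can be sketched roughly as follows. This is only an illustration: `buildEncodingTable()` is a hypothetical helper and not part of pdfparser, the Differences array is assumed to be already flattened into PHP integers and glyph-name strings, and `$standard` holds just a three-entry subset of StandardEncoding.

```php
<?php

/**
 * Hypothetical helper (not part of smalot/pdfparser).
 *
 * @param array $base        code => glyph name taken from BaseEncoding (may be empty)
 * @param array $differences flattened Differences array: a code, then one or more glyph names
 * @param array $standard    code => glyph name for StandardEncoding
 */
function buildEncodingTable(array $base, array $differences, array $standard): array
{
    // Step 1: initialize the table with the BaseEncoding entries.
    $table = $base;

    // Step 2: apply Differences. An integer sets the current code;
    // each glyph name that follows is assigned to consecutive codes.
    $code = 0;
    foreach ($differences as $item) {
        if (is_int($item)) {
            $code = $item;
            continue;
        }
        $table[$code] = $item;
        ++$code;
    }

    // Step 3: fill entries that are still undefined from StandardEncoding.
    // The array union operator keeps keys that already exist on the left.
    $table += $standard;
    ksort($table);

    return $table;
}

// Illustrative subset of StandardEncoding only (codes 65 to 67).
$standard = [65 => 'A', 66 => 'B', 67 => 'C'];

// No BaseEncoding present; Differences remaps codes 66 and 67.
$table = buildEncodingTable([], [66, 'Euro', 'bullet'], $standard);
// $table is [65 => 'A', 66 => 'Euro', 67 => 'bullet']
```

With an empty base table, code 65 is still defined because step 3 falls back to StandardEncoding, which is the behaviour this PR relies on.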

If the lookup of the BaseEncoding entry returns an empty string, use StandardEncoding as the value instead. Resolves #665.
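A minimal sketch of that fallback, for illustration only: `resolveBaseEncoding()` is a hypothetical name and not the method actually changed in this PR, and its argument stands for whatever string the BaseEncoding lookup produced.

```php
<?php

/**
 * Hypothetical helper (not the actual pdfparser code).
 * Returns the encoding name to use, given the raw result of the BaseEncoding lookup.
 */
function resolveBaseEncoding(string $baseEncoding): string
{
    // An empty string means the document had no BaseEncoding entry,
    // so fall back to the default named by the PDF Reference.
    return '' === $baseEncoding ? 'StandardEncoding' : $baseEncoding;
}

$missing = resolveBaseEncoding('');                 // 'StandardEncoding'
$present = resolveBaseEncoding('WinAnsiEncoding');  // 'WinAnsiEncoding'
```

An explicitly declared encoding passes through untouched; only the empty result is replaced.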

Checklist for code / configuration changes

[X] Please add at least one test case (unit test, system test, ...) to demonstrate that the change is working. If existing code was changed, your tests cover these code parts as well.
[X] Please run PHP-CS-Fixer before committing, to conform with our coding style. See https://github.com/smalot/pdfparser/blob/master/.php-cs-fixer.php for more information about our coding style.
[X] In case you fix an existing issue, please do one of the following:
- [X] Write in this text something like fixes #1234 to outline that you are providing a fix for the issue #1234.

GreyWyvern commented 5 months ago

PHP CS Fixer is complaining about indentation in Document.php, PDFObject.php and RawData\RawDataParser.php. Files I didn't even modify. :( Running PHP CS Fixer on my local (Windows) machine doesn't find these issues either.

k00ni commented 5 months ago

I merged #670 into master, which fixes these coding style issues. Please merge master into your branch to get rid of them.

k00ni commented 5 months ago

Thank you!

smalot / pdfparser

Baseencoding fallback #669
