dompdf / php-font-lib

A library to read, parse, export and make subsets of different types of font files.
GNU Lesser General Public License v2.1

How does subsetting work? #108

Closed keto33 closed 2 years ago

keto33 commented 2 years ago

I tried to create a simple subset with:

require_once '/home/keto/vendor/autoload.php';
use FontLib\Font;
use FontLib\BinaryStream;

$subset = str_split("test");
$subset = array_unique($subset);
sort($subset);

$font = Font::load("Arial.ttf");
$font->parse();

$font->setSubset($subset);
$font->reduce();

$tmp_name = "ArialSub.ttf";
touch($tmp_name);
$font->open($tmp_name, BinaryStream::modeReadWrite);
$font->encode(["OS/2"]);
$font->close();

It creates the TTF file without error, but the resulting font has no glyphs. What am I missing here?

bsweeney commented 2 years ago

Just pass your string to the setSubset method. What it's looking for is an array of Unicode code points, not an array of characters. The method already has the logic necessary to do the conversion and remove duplicate values.
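To make the distinction above concrete, here is a minimal sketch in plain PHP (no php-font-lib calls) contrasting an array of characters with an array of code points. It assumes the mbstring extension and PHP 7.4 or later for `mb_str_split`; the variable names are illustrative only, and it does not reproduce the library's internal conversion.

```php
<?php
// Illustration only: plain PHP, no php-font-lib involved.
$text = "test";

// What the original attempt built: an array of one-character strings.
$chars = array_unique(str_split($text));
sort($chars);

// What an array of code points looks like: one integer per character.
$codepoints = array_unique(array_map('mb_ord', mb_str_split($text)));
sort($codepoints);

var_dump($chars);       // strings: "e", "s", "t"
var_dump($codepoints);  // integers: 101, 115, 116
```

Per the reply above, passing the plain string (`"test"`) to setSubset lets the library handle this conversion itself, so building either array by hand should not be needed.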

keto33 commented 2 years ago

Thanks! Worked beautifully. Just a stupid question. What is the right way to save the resulting TTF in a string instead of

$font->open($tmp_name, BinaryStream::modeReadWrite);

bsweeney commented 2 years ago

The existing functionality doesn't provide for that use case. If you wanted to avoid writing to disk you might try writing to a memory stream, as does the library itself here, then read that into a variable.
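The general "write to a memory stream, then read it into a variable" pattern can be sketched with PHP's stream functions alone. This is only an illustration of the stream round trip: `$payload` is a stand-in for whatever bytes would be written, and nothing here calls php-font-lib.

```php
<?php
// Stand-in payload; in practice this would be the encoded font data.
$payload = "\x00\x01\x00\x00 example bytes";

// Open an in-memory stream for reading and writing.
$fp = fopen("php://memory", "rb+");
fwrite($fp, $payload);

// Writing leaves the pointer at the end, so rewind before reading back.
rewind($fp);
$captured = stream_get_contents($fp);
fclose($fp);

var_dump($captured === $payload);
```

The `rewind()` call is the step that is easy to miss: without it, `stream_get_contents()` reads from the current position at the end of the stream and returns an empty string.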

keto33 commented 2 years ago

Sorry for troubling you, but I am struggling to write the output to a memory stream. We cannot simply pass php://temp to

$font->open($tmp_name, BinaryStream::modeReadWrite);

If we open the memory stream by

fopen("php://temp", "rb+");

how should we pass it to the function?

Just out of curiosity, isn't the main purpose of this program to subset/process fonts for embedding in PDF? If so, why is the default behavior to save the TTF font to disk?

bsweeney commented 2 years ago

That's not the sole purpose but certainly the primary purpose. The primary use case was developed around Dompdf and so usage is written for how that library operates.

Regardless, you can write to a memory stream and capture the content in a variable using the following logic:

$font = FontLib\Font::load($font_file);
if ($font instanceof FontLib\TrueType\Collection) {
    $font = $font->getFont(0);
}
$font->parse();
$font->setSubset("abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ.:,;' (!?)+-*/== 1234567890");
$font->reduce();
$fp = fopen("php://memory", "rb+");
$font->setFile($fp);
$font->encode(array("OS/2"));
rewind($fp);
$subset = stream_get_contents($fp);

Or, using only php-font-lib functions for the stream handling:

$font = FontLib\Font::load($font_file);
if ($font instanceof FontLib\TrueType\Collection) {
    $font = $font->getFont(0);
}
$font->parse();
$font->setSubset("abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ.:,;' (!?)+-*/== 1234567890");
$font->reduce();
$font->open("php://memory", FontLib\BinaryStream::modeReadWrite);
$font->encode(array("OS/2"));
$size = $font->pos();
$font->seek(0);
file_put_contents($font_subset_filename, $font->read($size));
$font->close();

keto33 commented 2 years ago

I was unable to make it work. I tried the following (I added encode to your code):

$font = FontLib\Font::load($font_file);
if ($font instanceof FontLib\TrueType\Collection) {
    $font = $font->getFont(0);
}
$font->parse();
$font->setSubset("test");
$font->reduce();
$tmp_name = 'x1.ttf';
touch($tmp_name);
$font->open($tmp_name, BinaryStream::modeReadWrite);
$font->encode(["OS/2"]);

$font = FontLib\Font::load($font_file);
if ($font instanceof FontLib\TrueType\Collection) {
    $font = $font->getFont(0);
}
$font->parse();
$font->setSubset("test");
$font->reduce();
$fp = fopen("php://memory", "rb+");
$font->setFile($fp);
$font->encode(["OS/2"]);
rewind($fp);
file_put_contents('x2.ttf',stream_get_contents($fp));

The second output is a smaller and corrupted file.

bsweeney commented 2 years ago

You're right, I did forget that command. I've updated my example so it doesn't trip up anyone else.

But yes, I do see that a file created using the memory stream is corrupted. The file looks mostly correct, but the file header is wrong. I don't have an answer for what's going on right now.