Open m-kemarskyi opened 3 months ago
I tried to write a custom PDFUnicodeString class, but it didn't work out:
import { PDFObject } from 'pdf-lib';

export class PDFUnicodeString extends PDFObject {
  // The PDF spec allows newlines and parens to appear directly within a literal
  // string. These character _may_ be escaped. But they do not _have_ to be. So
  // for simplicity, we will not bother escaping them.
  static of = (value: string) => new PDFUnicodeString(value);

  private readonly value: string;

  private constructor(value: string) {
    super();
    this.value = value;
  }

  // Returns the value encoded as UTF-8 bytes.
  asBytes(): Uint8Array {
    return new TextEncoder().encode(this.value);
  }

  asString(): string {
    return this.value;
  }

  clone(): PDFUnicodeString {
    return PDFUnicodeString.of(this.value);
  }

  toString(): string {
    return `(${this.value})`;
  }

  // UTF-8 byte length plus the two enclosing parentheses.
  sizeInBytes(): number {
    return new TextEncoder().encode(this.value).length + 2;
  }

  copyBytesInto(buffer: Uint8Array, offset: number): number {
    buffer[offset++] = 40; // '('
    const encodedValue = new TextEncoder().encode(this.value);
    buffer.set(encodedValue, offset);
    offset += encodedValue.length;
    buffer[offset++] = 41; // ')'
    return encodedValue.length + 2;
  }
}
UPD: the PDFHexString class solves the problem: PDFHexString.fromText(YOUR_TEXT)
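For anyone wondering why this works: as far as I understand, PDFHexString.fromText writes the text as UTF-16BE code units in hex, prefixed with the byte order mark FEFF, which is the form PDF readers recognise as Unicode text. Below is a minimal standalone sketch of that encoding. toPdfHexTextString is a hypothetical helper written for illustration, not part of pdf-lib, and the exact output format of pdf-lib is an assumption on my side.

```typescript
// Hypothetical helper (not a pdf-lib API): encodes a JavaScript string as a
// PDF hex string holding UTF-16BE code units with a leading byte order mark.
function toPdfHexTextString(text: string): string {
  let hex = 'FEFF'; // byte order mark that flags the string as UTF-16BE
  for (let i = 0; i < text.length; i++) {
    // JavaScript strings are already UTF-16, so each code unit (including
    // surrogate halves) becomes exactly four hex digits.
    hex += text.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
  }
  return `<${hex}>`;
}

// 'Д' is U+0414 and 'а' is U+0430.
const encodedComment = toPdfHexTextString('Да');
```

With this encoding every Cyrillic letter takes two bytes in a form the reader can identify, instead of relying on the reader to guess that a literal string contains UTF-8.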
What were you trying to do?
I was trying to add a comment with Cyrillic letters to a PDF.
How did you attempt to do it?
What actually happened?
It turned out that one byte per character is used under the hood (see the result in the screenshot).
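A small sketch of what I think is going on: the class above writes UTF-8 bytes, where each Cyrillic letter takes two bytes, and a reader that treats every byte as one character then shows two wrong characters per letter. readOneBytePerChar is a hypothetical helper, and mapping each byte straight to a code point only approximates the encoding a PDF viewer would actually apply.

```typescript
// Hypothetical helper: interprets every byte as a separate character, which
// approximates how a one-byte-per-character reader treats the string.
function readOneBytePerChar(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

// UTF-8 encoding of the single letter 'Д' (U+0414), i.e. the bytes that
// new TextEncoder().encode('Д') produces.
const utf8Bytes = new Uint8Array([0xd0, 0x94]);

// One input letter turns into two unrelated characters.
const shown = readOneBytePerChar(utf8Bytes);
```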
What did you expect to happen?
I expected UTF-8 characters to work correctly.
How can we reproduce the issue?
Try to add a comment to a PDF file using the code I've provided.
Version
1.17.1
What environment are you running pdf-lib in?
Node