yob / pdf-reader

The PDF::Reader library implements a PDF parser conforming as much as possible to the PDF specification from Adobe.
MIT License

Extra nil safety in PageState #463

Closed yob closed 2 years ago

yob commented 2 years ago

Some of the text state operators mutate instance variables like @text_matrix and @text_line_matrix, and assume the instance variables exist.

In valid PDFs, the text state operators that do so appear within a BT text block, so it's fine for us to initialize the instance variables there.

However, invalid PDFs (like those generated by the fuzzer) might have text state operators without a preceding BT operator. For safety, initialize these vars in the constructor.
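
To illustrate the idea, here is a minimal sketch. It uses a simplified, hypothetical stand-in class (`SketchPageState`), not the actual PageState code: matrices are plain six-element arrays, and only BT and Td are modelled. The point is just that a Td arriving before any BT finds initialized matrices instead of nil.

```ruby
# Hypothetical, simplified stand-in for PageState (an assumption for
# illustration only; the real class has far more state and operators).
class SketchPageState
  # [a, b, c, d, e, f] components of an affine matrix; this is the identity.
  IDENTITY = [1, 0, 0, 1, 0, 0].freeze

  attr_reader :text_matrix, :text_line_matrix

  def initialize
    # Initialized here so text operators are safe without a preceding BT.
    @text_matrix = IDENTITY.dup
    @text_line_matrix = IDENTITY.dup
  end

  # BT: reset both matrices to the identity.
  def begin_text_object
    @text_matrix = IDENTITY.dup
    @text_line_matrix = IDENTITY.dup
  end

  # Td: translate the text line matrix by (tx, ty), then copy it to the
  # text matrix. Would raise NoMethodError on nil without the constructor init.
  def move_text_position(tx, ty)
    a, b, c, d, e, f = @text_line_matrix
    @text_line_matrix = [a, b, c, d, tx * a + ty * c + e, tx * b + ty * d + f]
    @text_matrix = @text_line_matrix.dup
  end
end

# An invalid stream: Td with no BT before it.
state = SketchPageState.new
state.move_text_position(10, 20)
```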

yob commented 2 years ago

Alternatively, we could make text state operators a no-op when outside a BT block. 🤔
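
A rough sketch of what that alternative could look like, again with a hypothetical simplified class (`GuardedPageState`) and an assumed `@in_text_object` flag rather than anything taken from the real code:

```ruby
# Hypothetical stand-in (an assumption for illustration only): text state
# operators are ignored unless we are between BT and ET.
class GuardedPageState
  IDENTITY = [1, 0, 0, 1, 0, 0].freeze

  attr_reader :text_matrix, :text_line_matrix

  def initialize
    @in_text_object = false
    @text_matrix = nil
    @text_line_matrix = nil
  end

  # BT: enter a text object and set up the matrices.
  def begin_text_object
    @in_text_object = true
    @text_matrix = IDENTITY.dup
    @text_line_matrix = IDENTITY.dup
  end

  # ET: leave the text object.
  def end_text_object
    @in_text_object = false
  end

  # Td: silently does nothing outside a BT/ET pair.
  def move_text_position(tx, ty)
    return unless @in_text_object

    a, b, c, d, e, f = @text_line_matrix
    @text_line_matrix = [a, b, c, d, tx * a + ty * c + e, tx * b + ty * d + f]
    @text_matrix = @text_line_matrix.dup
  end
end
```

The trade-off is that stray operators in an invalid PDF are dropped rather than applied to default matrices, so any text positioned by them would be lost instead of approximated.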

yob commented 2 years ago

At the very least, I might add some PageState unit specs that call the text state operators out of order, just to confirm they don't raise.