gen2brain / go-fitz

Golang wrapper for the MuPDF Fitz library
GNU Affero General Public License v3.0

Broken XREF Repair Crash #68

Closed nouritsu closed 1 year ago

nouritsu commented 1 year ago

My test data contains a file with a broken xref table. The program appears to crash when I call fitz.New() on this file.

Here is the output:

error: cannot find startxref
warning: trying to repair broken xref
warning: repairing PDF document
error: array not closed before end of file
error: aborting process from uncaught error!

My function (let's call it read_pdf()) takes a file path and returns (string, error). It works for all types of PDFs (text, images, etc.). However, when a corrupted file is passed, it panics and halts program execution.

I have already tried putting in a

defer func() {
    // recover() only intercepts Go panics raised on this goroutine.
    if r := recover(); r != nil {
        logrus.Errorf("Error parsing PDF file %v: %v", fpath, r)
    }
}()

inside my read_pdf function, but the function still appears to panic. The panic definitely occurs in fitz.New(), because no lines after it are executed.

gen2brain commented 1 year ago

This seems to be a duplicate of https://github.com/gen2brain/go-fitz/issues/57.

gen2brain commented 1 year ago

This should be fixed in https://github.com/gen2brain/go-fitz/commit/f15918d712b55047e829b1fe6ce0b46651ca0e40.