phpdave11 / gofpdi

Go Free PDF Document Importer
MIT License

Failed to read xref table: Unsupported field size #25

Closed by rorycl 4 years ago

rorycl commented 4 years ago

Hi. Here is an issue with v1.0.10:

panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported field size in cross-reference stream dictionary - only tested with /W [1 2 1]

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*Importer).SetSourceFile(0xc0000634c0, 0x5aad2a, 0xc)
    /home/rory/go/pkg/mod/github.com/phpdave11/gofpdi@v1.0.10/importer.go:69 +0x20d
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc00000e580, 0x5e44a0, 0xc0000c6000, 0x5aad2a, 0xc, 0x1, 0x5aa1ee, 0x9, 0x4083b88b1c22f64f)
    /home/rory/go/pkg/mod/github.com/phpdave11/gofpdf@v1.4.1-0.20200211150905-48c1f7f2764c/contrib/gofpdi/gofpdi.go:43 +0x46
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
    /home/rory/go/pkg/mod/github.com/phpdave11/gofpdf@v1.4.1-0.20200211150905-48c1f7f2764c/contrib/gofpdi/gofpdi.go:115
main.main()
    /home/rory/src/go-gofpdi-test/gofpdi-t2.go:32 +0x16c
exit status 2

Example file: http://www.campbell-lange.net/media/files/example3.pdf

phpdave11 commented 4 years ago

@rorycl thank you for the bug report and example PDF. I will update gofpdi to support variable field widths within a cross-reference stream. I'm planning on creating some unit tests that check to make sure the code can import all of the example PDFs you've provided so far.
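
To illustrate what "variable field widths" means here: each row of a cross-reference stream holds three big-endian integers whose byte widths come from the `/W` array, so a reader cannot assume `[1 2 1]`. The following is a minimal sketch, not gofpdi's actual code. The names `xrefEntry`, `readUint` and `parseXrefEntries` are hypothetical, and the input is assumed to be stream data that has already been Flate-decoded and un-predicted.

```go
package main

import (
	"errors"
	"fmt"
)

// xrefEntry is one decoded row of a cross-reference stream (hypothetical type).
type xrefEntry struct {
	Type   uint64 // 0 = free, 1 = uncompressed object, 2 = object in an object stream
	Field2 uint64 // meaning depends on Type (for type 1: byte offset)
	Field3 uint64 // meaning depends on Type (for type 1: generation number)
}

// readUint reads a big-endian unsigned integer of len(b) bytes.
// An empty slice yields 0, which is the default for zero-width fields 2 and 3.
func readUint(b []byte) uint64 {
	var v uint64
	for _, c := range b {
		v = v<<8 | uint64(c)
	}
	return v
}

// parseXrefEntries splits decoded stream data into rows using the /W widths w.
func parseXrefEntries(data []byte, w [3]int) ([]xrefEntry, error) {
	for _, n := range w {
		if n < 0 || n > 8 {
			return nil, errors.New("field width out of range")
		}
	}
	rowLen := w[0] + w[1] + w[2]
	if rowLen == 0 || len(data)%rowLen != 0 {
		return nil, errors.New("stream length is not a multiple of the row length")
	}
	entries := make([]xrefEntry, 0, len(data)/rowLen)
	for off := 0; off < len(data); off += rowLen {
		row := data[off : off+rowLen]
		e := xrefEntry{Type: 1} // a zero-width type field defaults to type 1
		if w[0] > 0 {
			e.Type = readUint(row[:w[0]])
		}
		e.Field2 = readUint(row[w[0] : w[0]+w[1]])
		e.Field3 = readUint(row[w[0]+w[1]:])
		entries = append(entries, e)
	}
	return entries, nil
}

func main() {
	// Two rows with /W [1 3 1]: a type-1 entry at offset 0x012345, then a free entry.
	data := []byte{1, 0x01, 0x23, 0x45, 0, 0, 0, 0, 0, 0xff}
	entries, err := parseXrefEntries(data, [3]int{1, 3, 1})
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("type=%d field2=%d field3=%d\n", e.Type, e.Field2, e.Field3)
	}
}
```

Reading each field byte by byte keeps the same loop valid for any width from 0 to 8 bytes, rather than special-casing particular `/W` combinations.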

rorycl commented 4 years ago

@phpdave11 thanks for your rapid response. Sorry I can't contribute patches: I lack both the time and the knowledge of PDF xrefs.

Feel free to use the PDFs I have provided as part of a test suite. However, I intend to delete them from my server in the next month or so if that is ok.

By the way I'm also getting some other problems with PDFs exported from Word: panic: Failed to get page resources: Page 5 does not exist!!

and another from Google Docs: panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Failed to read prev xref: Unsupported /DecodeParms - only tested with /Columns 4 /Predictor 12

but I've been unable to recreate these issues on non-proprietary documents.

Thanks for your great work.

phpdave11 commented 4 years ago

Thanks! If you are willing to send me sample documents generated from Word and Google Docs that cause those panics, I can work on fixing those issues as well.
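
For context on the `/DecodeParms` error quoted above: `/Predictor 12` means the Flate-decoded data is stored as PNG-filtered rows, each prefixed by a filter-type byte, and `/Columns` gives the row width, so it need not be 4. The following is a minimal sketch of undoing that filtering for an arbitrary column count. It is not gofpdi's implementation. The name `undoPNGPredictor` is hypothetical, the sketch assumes one byte per sample (`/Colors 1`, `/BitsPerComponent 8`), and it handles only the None, Sub and Up filters.

```go
package main

import (
	"fmt"
)

// undoPNGPredictor reverses PNG row filtering on Flate-decoded data.
// Assumption: one byte per sample, so each row is 1 filter byte + columns bytes.
func undoPNGPredictor(data []byte, columns int) ([]byte, error) {
	if columns <= 0 {
		return nil, fmt.Errorf("invalid /Columns %d", columns)
	}
	rowLen := columns + 1 // one filter-type byte, then the row bytes
	if len(data)%rowLen != 0 {
		return nil, fmt.Errorf("data length %d is not a multiple of %d", len(data), rowLen)
	}
	out := make([]byte, 0, len(data)/rowLen*columns)
	prev := make([]byte, columns) // the row above the first row counts as all zeros
	for off := 0; off < len(data); off += rowLen {
		filter := data[off]
		cur := make([]byte, columns)
		copy(cur, data[off+1:off+rowLen])
		switch filter {
		case 0: // None: bytes are stored unchanged
		case 1: // Sub: add the already-decoded byte to the left
			for i := 1; i < columns; i++ {
				cur[i] += cur[i-1]
			}
		case 2: // Up: add the byte at the same position in the previous row
			for i := range cur {
				cur[i] += prev[i]
			}
		default: // Average (3) and Paeth (4) are left out of this sketch
			return nil, fmt.Errorf("unsupported PNG filter type %d", filter)
		}
		out = append(out, cur...)
		prev = cur
	}
	return out, nil
}

func main() {
	// Two rows with /Columns 5, both using the Up filter (type 2).
	encoded := []byte{
		2, 1, 0, 0, 16, 0,
		2, 0, 0, 0, 32, 0,
	}
	decoded, err := undoPNGPredictor(encoded, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded) // prints [1 0 0 16 0 1 0 0 48 0]
}
```

The decoded rows are exactly `columns` bytes wide, so the result can be split into cross-reference entries whose `/W` widths sum to that column count.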

rorycl commented 4 years ago

@phpdave11 Unfortunately I'm having problems creating versions of confidential documents that exhibit the same import problems.

My register of current problems (as against v1.0.10) is as follows; interestingly, "Microsoft" doesn't appear in any of them.

0d3cb006-1052-463f-8fb8-8644aac4ccdb.pdf
panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported field size in cross-reference stream dictionary - only tested with /W [1 2 1]
Producer:       pdfTeX-1.40.18
Pages:          1
PDF version:    1.5                                                                            

27c8b519-ede4-4cbf-ad80-fdce671136c4.pdf                                                       
panic: Failed to get page resources: Page 5 does not exist!!                                   
Producer:       Skia/PDF m80                                                                   
Pages:          31
PDF version:    1.4                                                                            

58b71cd1-7814-42a2-b1dd-0145d16f69c4.pdf
panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported field size in cross-reference stream dictionary - only tested with /W [1 2 1]
Producer:       pdfTeX-1.40.19                                                                 
Pages:          1
PDF version:    1.5

6cb53357-0e6c-43bf-a978-264a296dfc43.pdf
panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported field size in cross-reference stream dictionary - only tested with /W [1 2 1]
Producer:       pdfTeX-1.40.18
Pages:          6
PDF version:    1.5

b6a8a830-0a67-4cd5-a0ba-095a255a0333.pdf
panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported /DecodeParms - only tested with /Columns 4 /Predictor 12 
Producer:       3-Heights(TM) PDF Security Shell 4.5.24.1 (http://www.pdf-tools.com)           
Pages:          400                                                                            
PDF version:    1.6

cb7456ac-e0a3-48b7-8824-854ace5236a0.pdf                                                       
panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported /DecodeParms - only tested with /Columns 4 /Predictor 12 
Producer:       mPDF 6.0                                                                       
Pages:          385
PDF version:    1.6

d46ced5f-1b26-431f-bafb-7dc2d68c3f43.pdf
panic: Failed to get page resources: Page 3 does not exist!!
Producer:       pdfTeX-1.40.18
Pages:          9
PDF version:    1.5

dcbd8836-1367-4312-84a5-867b7bcd4721.pdf
panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Failed to read prev xref: Unsupported /DecodeParms - only tested with /Columns 4 /Predictor 12
Producer:       Skia/PDF m81                                                                   
Pages:          4
PDF version:    1.5                                                                            

eca02c54-c700-4beb-9c8e-dd882f1a71da.pdf                                                       
panic: Failed to get page resources: Page 6 does not exist!!                                   
Producer:       Skia/PDF m80                                                                   
Pages:          34
PDF version:    1.4

phpdave11 commented 4 years ago

@rorycl this has been fixed in gofpdi v1.0.11. If you are able to submit PDFs that produce these errors, please open a new issue for each one and I can work on fixing the bugs: