Closed. Kleissner closed this issue 2 years ago.
Current version (prep-rc-v3.16.0-take2) crashes when extracting text.
panic: runtime error: index out of range [-1]

goroutine 61 [running]:
github.com/unidoc/unipdf/internal/cmap.(*CMap).parseBfrange(0xc016190f70, 0x1ec9380, 0xc009b57f70)
	C:/Temp/Go/src/github.com/unidoc/unipdf/internal/cmap/cmap.go:12 +0x99e
github.com/unidoc/unipdf/internal/cmap.(*CMap).parse(0xc016190f70, 0x67c, 0xe00)
	C:/Temp/Go/src/github.com/unidoc/unipdf/internal/cmap/cmap.go:12 +0x5b4
github.com/unidoc/unipdf/internal/cmap.LoadCmapFromData(0xc009b68000, 0x67c, 0xe00, 0x1, 0x0, 0x0, 0x2265320)
	C:/Temp/Go/src/github.com/unidoc/unipdf/internal/cmap/cmap.go:12 +0x1a8
github.com/unidoc/unipdf/model._cfba(0x2265320, 0xc008e94900, 0xc003a03080, 0xc008e94900, 0xc008e98778, 0x3633373838353301)
	C:/Temp/Go/src/github.com/unidoc/unipdf/model/model.go:2128 +0x149
github.com/unidoc/unipdf/model._ecggg(0x22650e0, 0xc008e9ecc0, 0x0, 0x0, 0x0, 0x0)
	C:/Temp/Go/src/github.com/unidoc/unipdf/model/model.go:2238 +0x9ff
github.com/unidoc/unipdf/model._aabe(0x22650e0, 0xc008e9ecc0, 0x1, 0x22650e0, 0xc008e9ecc0, 0x0)
	C:/Temp/Go/src/github.com/unidoc/unipdf/model/model.go:1626 +0x5f
github.com/unidoc/unipdf/model.NewPdfFontFromPdfObject(...)
	C:/Temp/Go/src/github.com/unidoc/unipdf/model/model.go:1175
github.com/unidoc/unipdf/extractor.(*textObject).getFontDirect(0xc00208d760, 0xc0151f3060, 0x4, 0x4, 0x2cd3600, 0xc000101000)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:16 +0x7c
github.com/unidoc/unipdf/extractor.(*textObject).getFont(0xc00208d760, 0xc0151f3060, 0x4, 0x0, 0x0, 0x3ff0000000000000)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:106 +0x7c
github.com/unidoc/unipdf/extractor.(*textObject).setFont(0xc00208d760, 0xc0151f3060, 0x4, 0x3ff0000000000000, 0x0, 0x0)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:210 +0x68
github.com/unidoc/unipdf/extractor.(*Extractor).extractPageText.func1(0xc0099d8540, 0x227bf20, 0x2cd0d78, 0x227bf20, 0x2cd0d78, 0x1ec0e00, 0xc00bd22bb8, 0x1ec0e00, 0xc00bd22bc0, 0x3ff0000000000000, ...)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:141 +0x29ec
github.com/unidoc/unipdf/contentstream.(*ContentStreamProcessor).Process(0xc006dc15b8, 0xc0098d1170, 0x0, 0x0)
	C:/Temp/Go/src/github.com/unidoc/unipdf/contentstream/contentstream.go:321 +0x54b
github.com/unidoc/unipdf/extractor.(*Extractor).extractPageText(0xc003a02fc0, 0xc009a02000, 0x1c43, 0xc0098d1170, 0x3ff0000000000000, 0x0, 0x0, 0x0, 0x3ff0000000000000, 0x0, ...)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:141 +0xa73
github.com/unidoc/unipdf/extractor.(*Extractor).ExtractPageText(0xc003a02fc0, 0x11cefbf, 0x30, 0x1f77f00, 0xc006dc1701, 0xc0099d8060)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:90 +0xf0
github.com/unidoc/unipdf/extractor.(*Extractor).ExtractTextWithStats(0xc003a02fc0, 0xc003a02fc0, 0x0, 0x0, 0x0, 0xc0086261a0, 0x0)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:220 +0x47
github.com/unidoc/unipdf/extractor.(*Extractor).ExtractText(...)
	C:/Temp/Go/src/github.com/unidoc/unipdf/extractor/extractor.go:131
No crash. PDF may or may not be corrupted, but it shouldn't crash either way.
This is the file causing the crash when extracting text: e45f8ebb-bb7d-415e-8ae5-ab8c0ea56552.pdf
Seems like this has been fixed already. Cannot reproduce in latest version.