Open gravit22 opened 1 year ago
Can you share the pdf?
Sure. newithkuil_affixes.pdf
`[e.g., canyon → rift valley]` is the problematic area. The `ft` ligature is causing trouble. It seems most other PDF readers convert this to `0`, i.e. `ri0 valley`. It'd be nice to do better.
Thank you for figuring it out. How can it be fixed?
I'd like to better understand how the other programs are coming up with `0` for that glyph.
The crate panics when I try to extract text from the PDF:
thread 'main' panicked at 'missing char 48 in map {40: "R", 7: "i", 31: "v", 59: "-", 57: "]", 18: "C", 26: "P", 28: "H", 4: "w", 37: "f", 5: "e", 51: "A", 50: "q", 43: "x", 46: "”", 25: "k", 27: ".", 60: "O", 34: "/", 52: "(", 17: "h", 11: "p", 30: "B", 10: " ", 2: "l", 24: "’", 14: "s", 35: "S", 3: "o", 29: "!", 45: "“", 44: "W", 41: "V", 15: "d", 19: "m", 47: "→", 13: "t", 20: "b", 53: ")", 16: "u", 12: "a", 58: "G", 9: "g", 38: "z", 55: ";", 56: "[", 1: "F", 39: "E", 42: "D", 49: "‘", 54: "J", 36: "I", 6: "r", 8: "n", 21: "c", 23: "y", 32: "j", 33: ",", 22: "T"}', /home/mykhailo/.cargo/registry/src/github.com-1ecc6299db9ec823/pdf-extract-0.6.4/src/lib.rs:733:27
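The panic message shows a lookup for character code 48 failing in a code-to-text map. One possible explanation for the `0` that other readers produce, not confirmed anywhere in this thread, is that they fall back to treating the unmapped code as a plain character value, and 48 happens to be the code for `0`. Below is a minimal Rust sketch of that kind of non-panicking fallback. The function `decode_codes` and its signature are hypothetical and are not taken from `pdf-extract`; the map type is only assumed to resemble the one printed in the panic.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of pdf-extract): turn character codes into
/// text using a code-to-text map, without panicking on codes the map lacks.
fn decode_codes(codes: &[u32], map: &HashMap<u32, String>) -> String {
    let mut out = String::new();
    for &code in codes {
        match map.get(&code) {
            // The font's map knows this code: use its text.
            Some(text) => out.push_str(text),
            // Assumed fallback: interpret the code as a Unicode scalar value
            // (so 48 becomes '0'). Control characters and invalid values
            // become U+FFFD REPLACEMENT CHARACTER instead.
            None => out.push(
                char::from_u32(code)
                    .filter(|c| !c.is_control())
                    .unwrap_or('\u{FFFD}'),
            ),
        }
    }
    out
}
```

With a map that contains only 6 → "r" and 7 → "i" (as in the panic output), `decode_codes(&[6, 7, 48], &map)` returns `ri0`, which matches what the other readers show. Recovering the actual `ft` ligature would need extra information, such as the glyph name from the font, so this sketch only avoids the panic.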