Open pblesi opened 10 years ago
Did you find a solution for this? I believe I'm facing a similar issue.
I suspect this is an issue with our text layout algorithms in the PageLayout
class.
Unfortunately I'm short on time at the moment, but I'll happily accept patches if you want to investigate further.
reader.pages.at(3).text produces this output:
• FAX/Scanner/Copiers • 2 Digital Cameras • 1 Cisco Router • Hub
However, the text shown when the PDF is rendered is:
4 FAX/Scanner/Copiers 2 Digital Cameras 1 Cisco Router 1 Hub
As you can see, the numbers for two of the list items are missing: the 4 before FAX/Scanner/Copiers and the 1 before Hub.
It appears I cannot attach the PDF file, but the raw content for this page is:
/C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 186.9 Tm <0078>Tj /TT2 1 Tf -0.0004 Tc 0.0026 Tw 0.46 0 Td [( )-760(2 Poly Com systems )]TJ ET EMC
/P <>BDC BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 172.26 Tm <0078>Tj /TT2 1 Tf -0.0002 Tc 0.7624 Tw 0.46 0 Td [( 4 )760(FAX/Scanner/Copiers )]TJ ET EMC
/P <>BDC BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 157.68 Tm <0078>Tj /TT2 1 Tf -0.0002 Tc 0.0024 Tw 0.46 0 Td [( )-760(2 Digita)-4(l)2( Cameras )]TJ ET EMC
/P <>BDC BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 143.04 Tm <0078>Tj /TT2 1 Tf -0.0002 Tc 0.0024 Tw 0.46 0 Td [( )-760(1 Cisco Router )]TJ ET EMC
/P <>BDC BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 128.46 Tm <0078>Tj /TT2 1 Tf -0.0014 Tc 0.7636 Tw 0.46 0 Td [( 1 )760(Hub )]TJ ET EMC
/P <>BDC BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 113.82 Tm <0078>Tj /TT2 1 Tf -0.0004 Tc 0.0026 Tw 0.46 0 Td [( )-760(6 NEC projectors mounted on portable carts )]TJ ET EMC
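One pattern in the stream may help anyone who picks this up. The two items that lose their number (FAX/Scanner/Copiers and Hub) are the only ones written with a large Tw and a positive TJ adjustment. The other items use a near-zero Tw and a negative adjustment. The rough Ruby sketch below compares the two encodings. It does not use pdf-reader at all; `STREAM`, `TJ_RE`, `ELEMENT_RE`, `word_gaps` and `analyze` are names made up for this example. It assumes the PDF spec's displacement rule (Tw is added for each space, and a TJ number n shifts the pen by -n/1000 at font size 1) and ignores glyph widths and Tc, so the figures are relative extra spacing only.

```ruby
# Four text objects copied from the raw page content above.
STREAM = <<~'PDF'
  BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 172.26 Tm <0078>Tj /TT2 1 Tf -0.0002 Tc 0.7624 Tw 0.46 0 Td [( 4 )760(FAX/Scanner/Copiers )]TJ ET
  BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 157.68 Tm <0078>Tj /TT2 1 Tf -0.0002 Tc 0.0024 Tw 0.46 0 Td [( )-760(2 Digita)-4(l)2( Cameras )]TJ ET
  BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 143.04 Tm <0078>Tj /TT2 1 Tf -0.0002 Tc 0.0024 Tw 0.46 0 Td [( )-760(1 Cisco Router )]TJ ET
  BT /C2_0 1 Tf 0 Tc 0 Tw 12 0 0 12 97.2 128.46 Tm <0078>Tj /TT2 1 Tf -0.0014 Tc 0.7636 Tw 0.46 0 Td [( 1 )760(Hub )]TJ ET
PDF

# Captures the Tw value in effect for a TJ array, plus the array body.
TJ_RE = /(-?[\d.]+)\s+Tw\s+[\d.\-\s]+Td\s*\[(.*?)\]TJ/
# One TJ element: either a (string) or a bare number.
ELEMENT_RE = /\(([^)]*)\)|(-?\d+(?:\.\d+)?)/

# Extra horizontal space (text-space units, glyph widths excluded)
# accumulated before the start of each word in a TJ array.
def word_gaps(tw, body)
  gaps = []
  pending = 0.0
  in_word = false
  body.scan(ELEMENT_RE) do |str, num|
    if num
      pending -= num.to_f / 1000.0 # positive numbers move the pen left
    else
      str.each_char do |ch|
        if ch == " "
          pending += tw            # word spacing applies to each space
          in_word = false
        elsif !in_word
          gaps << pending.round(4) # a new word starts here
          pending = 0.0
          in_word = true
        end
      end
    end
  end
  gaps
end

def analyze(stream)
  stream.scan(TJ_RE).map do |tw, body|
    elements = body.scan(ELEMENT_RE)
    number_gap, label_gap = word_gaps(tw.to_f, body).first(2)
    {
      text: elements.map(&:first).compact.join.strip,
      tw: tw.to_f,
      first_adjust: elements.map(&:last).compact.first.to_f,
      number_gap: number_gap,
      label_gap: label_gap
    }
  end
end

analyze(STREAM).each do |row|
  puts format("%-24s Tw=%.4f TJ=%+5d  before number=%.4f  before label=%.4f",
              row[:text], row[:tw], row[:first_adjust].to_i,
              row[:number_gap], row[:label_gap])
end
```

Under those assumptions both encodings put roughly 0.76 units of extra space before the number and well under 0.01 before the label, which matches the rendered page, where all items line up. That suggests the stream itself is consistent and the extraction differs only for the large-Tw, positive-adjustment form, so that combination may be a reasonable place to start looking in PageLayout. This is a hypothesis drawn from the stream, not a confirmed diagnosis.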