I was taking a look at your software and I must say it's very good, so I decided to give it a try.
I have a PDF that contains a vector image.
I managed to extract the content stream for the vector content. This is what I got:
q
1 0 0 1 340.9799957 298.8000031 cm
1 g
0 0 m
20.04 0 l
20.04 -11.46 l
0 -11.46 l
0 0 l
h
f*
Q
BT
/C2_0 10.121 Tf
-0.175 Tc 342.06 289.56 Td
<0004000500060004>Tj
ET
(...)
I'm able to render the vector part, but I'm having some difficulty rendering the text.
The hex string does not seem to be a valid text string.
My guess is that it's an index into the font's code page, in this case the font referred to by /C2_0:
0004000500060004 => 0004 0005 0006 0004
I'm assuming 2 bytes per code, although that depends on the font's encoding.
I don't know where to check that information. (I know that simple fonts use only one byte per code.)
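To make the 2-byte assumption concrete, here is a minimal Python sketch of the split shown above. The function name `split_codes` and the `bytes_per_code` parameter are made up for illustration; the value 2 is only the assumption stated here, and the real code width depends on the encoding of the font behind /C2_0.

```python
def split_codes(hex_string: str, bytes_per_code: int = 2) -> list[int]:
    """Split the contents of a PDF hex string (without < >) into character codes."""
    # Whitespace inside a PDF hex string is ignored.
    digits = "".join(hex_string.split())
    # If the number of hex digits is odd, the final digit is assumed to be 0.
    if len(digits) % 2:
        digits += "0"
    data = bytes.fromhex(digits)
    if len(data) % bytes_per_code:
        raise ValueError("string length is not a multiple of the code width")
    # Multi-byte codes are read high-order byte first.
    return [
        int.from_bytes(data[i:i + bytes_per_code], "big")
        for i in range(0, len(data), bytes_per_code)
    ]


codes = split_codes("0004000500060004")
print(codes)  # [4, 5, 6, 4]
```

With `bytes_per_code=1` the same string would instead yield eight one-byte codes, which is why the code width has to come from the font rather than from the string itself.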
My question is: how can I get access to the font and its code page information so that I can extract the text?
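For context, a font resource such as /C2_0 is looked up in the /Font entry of the page's /Resources dictionary, and if that font carries a /ToUnicode stream, the stream maps character codes to Unicode text. The sketch below shows what decoding with such a map could look like, and it is not tied to any particular library. The CMap fragment in `SAMPLE_CMAP` is entirely invented for illustration (the real mapping for this PDF is unknown), the helper names are hypothetical, and only `bfchar` entries are handled, not `bfrange`.

```python
import re

# Invented example data: NOT taken from the PDF in question.
SAMPLE_CMAP = """
3 beginbfchar
<0004> <0041>
<0005> <0042>
<0006> <0043>
endbfchar
"""


def parse_bfchar(cmap_text: str) -> dict[int, str]:
    """Collect code -> Unicode string pairs from the bfchar sections of a CMap."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            # Destination values are UTF-16BE encoded.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def decode(codes: list[int], mapping: dict[int, str]) -> str:
    """Map character codes to text, using U+FFFD for codes with no entry."""
    return "".join(mapping.get(code, "\ufffd") for code in codes)


text = decode([4, 5, 6, 4], parse_bfchar(SAMPLE_CMAP))
print(text)  # ABCA (only because of the invented sample mapping)
```

If the font has no /ToUnicode stream, the codes still select glyphs for rendering, but recovering the text would need some other source of information.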
Or better yet, is there a simpler way to get all of this without having to parse the vector data myself?
For example, by getting the objects directly: PdfLine, PdfText, PdfCircle, etc.
Thanks.