Open kumakichi opened 6 years ago
If I understand correctly, the conversion is not guaranteed to work when there are too many characters. I believe this is mostly due to a limitation of the pdflib-lite hack in question. For a perfect solution, we would seemingly need to switch to another library (maybe the one used in engines like pdflatex would work?). But since SWF is dead, I fear that nobody will be interested in writing such a program.
Thanks for your fancy project. I'm not sure whether you can read Chinese or not (I ran into problems while using this tool to convert some SWF files in Chinese). I came across 2 problems:
version: gfx2gfx-pdf2text - part of swftools 0.9.2 (build 8d5a70b)
- Some characters are not converted correctly (demo file address)
- fatal exception: [1106] PDF_encoding_set_char: Integer parameter 'slot' has bad value 256 (demo file address)
The 2nd problem is easy to fix (I don't know whether this fix is right or not): I just added 2 lines of code after https://github.com/RunasSudo/gfx2gfx-pdftext/blob/8d5a70b1d8526b7d596b9675a23386636a5a3b35/lib/devices/pdf.c#L392
if(gt7bits>=128) gt7bits=0;
But the 1st problem still exists: some characters are missing after conversion.
I tried the code, but some characters are wrong and reverted, and I cannot copy text from the PDF. How do I set the font for the conversion? Please email steve8000818@gmail.com if there is a solution.
I believe that this is almost impossible to resolve. Essentially it is a restriction in pdflib-lite. The goal of this hack seems to be to work around the problem temporarily when there are not too many non-ASCII characters, so it does not work properly for languages with a huge character set (like Chinese). gt7bits>=128 seems to be exactly the case where there are too many characters; you can see this from the comments in the code.
I opened an issue at https://github.com/matthiaskramm/swftools/issues/68. However, I am not sure whether anybody is interested in writing code for this.
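To make the failure mode concrete, here is a minimal C sketch of how a 256-slot encoding can run out of room. It is not the code from pdf.c: the function assign_slot, the table slot_to_unicode, and the assumption that non-ASCII characters are placed at slot 128 + gt7bits are all hypothetical, chosen only to mirror the error message and the wrap-around fix quoted above. For simplicity, every call is treated as a new, distinct character.

```c
#include <assert.h>

#define NUM_SLOTS 256       /* an 8-bit encoding has slots 0..255 */
#define FIRST_HIGH_SLOT 128 /* assumption: non-ASCII characters start here */

/* Hypothetical encoding table: slot -> Unicode code point (0 = unused). */
static unsigned int slot_to_unicode[NUM_SLOTS];
static int gt7bits = 0; /* number of non-ASCII characters placed so far */

/* Assign a slot to one code point. With wrap != 0 the counter is reset
 * when it reaches 128 (the workaround from the thread); otherwise -1 is
 * returned instead of producing the out-of-range slot 256. */
static int assign_slot(unsigned int unicode, int wrap)
{
    int slot;
    if (unicode < 128) {
        slot_to_unicode[unicode] = unicode; /* ASCII maps to itself */
        return (int)unicode;
    }
    if (gt7bits >= 128) {
        if (!wrap)
            return -1; /* would be slot 256, i.e. "bad value 256" */
        gt7bits = 0;   /* wrap around: earlier slots get overwritten */
    }
    slot = FIRST_HIGH_SLOT + gt7bits++;
    assert(slot < NUM_SLOTS);
    slot_to_unicode[slot] = unicode;
    return slot;
}
```

Under these assumptions the wrap-around avoids the fatal exception, but the 129th non-ASCII character takes over slot 128 from the first one, which would be consistent with the wrong or missing characters reported above.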
Do you have any methods to export all resources and coordinates for each letter/character?
Is this the reason for the comment
// cross our fingers and hope there aren't more than 256 glyphs
in lib/devices/pdf.c?
I don't understand C, but it seems like a dummy font FreeSerif is being made for holding glyphs. What about making multiple dummy fonts, each one containing a maximum of 256 glyphs?
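The idea of several dummy fonts can be sketched in a few lines of C. This only illustrates the arithmetic of splitting glyph indices into chunks of 256. The struct glyph_location and the functions locate_glyph and fonts_needed are hypothetical names, and whether PDFlib-Lite and the surrounding code in pdf.c could actually be made to work this way is not established in this thread.

```c
#include <assert.h>

#define GLYPHS_PER_FONT 256 /* one 8-bit encoding per dummy font */

/* Hypothetical result type: which dummy font holds a glyph, and at which slot. */
struct glyph_location {
    int font_index; /* 0 for the first dummy font, 1 for the second, ... */
    int slot;       /* 0..255 within that font's encoding */
};

/* Map a running glyph index (0, 1, 2, ...) to a font and a slot. */
static struct glyph_location locate_glyph(int glyph_index)
{
    struct glyph_location loc;
    assert(glyph_index >= 0);
    loc.font_index = glyph_index / GLYPHS_PER_FONT;
    loc.slot = glyph_index % GLYPHS_PER_FONT;
    return loc;
}

/* Number of dummy fonts needed to hold glyph_count glyphs (rounded up). */
static int fonts_needed(int glyph_count)
{
    return (glyph_count + GLYPHS_PER_FONT - 1) / GLYPHS_PER_FONT;
}
```

Under this scheme glyph 300 would land in the second font (index 1) at slot 44, and a document with 257 distinct glyphs would need two dummy fonts.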
IMHO, I don't find that kind of approach reasonable. PDFlib-Lite, a subset of the proprietary PDFlib that is still sold today, has been dead since 2011. I think that a rewrite based on a FOSS PDF library would be more attractive.
I managed to do that, if I remember correctly. Maybe one of the utilities in SWFTools works. See also https://reverseengineering.stackexchange.com/questions/133/how-does-one-reverse-engineer-a-swf-file