Open GoogleCodeExporter opened 9 years ago
Hello,
I have the same problem. Have you found a solution to this? It makes the
library almost useless for Eastern European countries.
Original comment by vant...@gmail.com
on 3 Apr 2012 at 6:59
Hi,
I have found a workaround for this problem. If I replace each regular space with
a non-breaking space (\u00a0) and call chunk.setSplitCharacter(new
NonBreakingSplitCharacter()), justification works correctly.
I'm using the following script to replace the space characters in a purePdf chunk:
var chunk:Chunk; //purePdf chunk
var text:String; //text to be added to pdf document
var tmpArray:Array;
tmpArray = text.split(" ");
text = tmpArray.join("\u00a0");
chunk = new Chunk(text, getFont());
chunk.setSplitCharacter(new NonBreakingSplitCharacter());
Original comment by aivar.p...@gmail.com
on 3 Apr 2012 at 7:30
Hello,
Many thanks for your solution. I've modified it slightly with a new class
that extends DefaultSplitCharacter and looks like this:
package pdfGenerator {
    import org.purepdf.ISplitCharacter;
    import org.purepdf.pdf.DefaultSplitCharacter;
    import org.purepdf.pdf.PdfChunk;

    /**
     * Split character that also treats the non-breaking space as a split point.
     * @author Marcin Wantuch
     */
    public class MySplitCharacter extends DefaultSplitCharacter implements ISplitCharacter {
        public function MySplitCharacter() {
            super();
        }

        /**
         * Checks whether the current character is a split character.
         * @param start - start position in the array
         * @param current - current position in the array
         * @param end - end position in the array
         * @param cc - the character array that has to be checked
         * @param ck - chunk array
         * @return true, if this is a split character
         */
        override public function isSplitCharacter( start: int, current: int, end: int, cc: Vector.<int>, ck: Vector.<PdfChunk> ): Boolean {
            var c: int = getCurrentCharacter( current, cc, ck );
            // Space and control characters, '-', U+2010 (hyphen) and U+00A0 (non-breaking space)
            if ( c <= ' '.charCodeAt(0) || c == '-'.charCodeAt(0) || c == 8208 /*'\u2010'*/ || c == 160 /*'\u00a0'*/ ) {
                return true;
            }
            if( c < 0x2002 ) {
                return false;
            }
            return ( (c >= 0x2002 && c <= 0x200b)
                || (c >= 0x2e80 && c < 0xd7a0)
                || (c >= 0xf900 && c < 0xfb00)
                || (c >= 0xfe30 && c < 0xfe50)
                || (c >= 0xff61 && c < 0xffa0) );
        }
    }
}
I then set this class as the default split character. The only change is the
added "c == 160 /*'\u00a0'*/" at the end of the if condition (160 is the decimal
equivalent of 0x00a0). It works a little better, because a new line no longer
begins with commas, periods, quotes, etc. So many thanks for your idea.
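To check which characters this condition treats as split points without running
purePdf, here is a minimal standalone sketch in TypeScript. It only mirrors the
condition from the class above. isSplitCodePoint and toNonBreakingSpaces are
hypothetical names, not part of purePdf, and the real method receives a character
vector and a chunk array rather than a single code point.

```typescript
// Standalone sketch, not purePdf code: both function names are hypothetical.
const NBSP = 0x00a0;   // non-breaking space, decimal 160
const HYPHEN = 0x2010; // Unicode hyphen, decimal 8208

// Mirrors the ranges checked in MySplitCharacter.isSplitCharacter.
function isSplitCodePoint(c: number): boolean {
  // Space and control characters, '-', U+2010, and the added U+00A0 case.
  if (c <= 0x20 || c === 0x2d || c === HYPHEN || c === NBSP) {
    return true;
  }
  if (c < 0x2002) {
    return false;
  }
  return (
    (c >= 0x2002 && c <= 0x200b) ||
    (c >= 0x2e80 && c < 0xd7a0) ||
    (c >= 0xf900 && c < 0xfb00) ||
    (c >= 0xfe30 && c < 0xfe50) ||
    (c >= 0xff61 && c < 0xffa0)
  );
}

// Same replacement as the split/join workaround: every regular space
// becomes a non-breaking space.
function toNonBreakingSpaces(text: string): string {
  return text.split(" ").join("\u00a0");
}
```

With this sketch, isSplitCodePoint(0x00a0) returns true, while
isSplitCodePoint(0x2c) (a comma) returns false, so a break can follow the
non-breaking space but punctuation is never a split point on its own.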
I have another question, now that the PDF is generated with Unicode
characters. Have you tried copying text from such a PDF when viewing it in,
for example, Adobe Reader? When I paste it somewhere, I get different characters
from those shown in the document, so I can't search the PDF by words either.
Is there any way to fix this?
Original comment by vant...@gmail.com
on 5 Apr 2012 at 6:25
You can check it here: http://89.234.211.20/_marcinw/test.pdf Just try
to copy and paste selected text from the PDF. How can this be fixed? The font's
base encoding is IDENTITY_H and has to stay that way.
Original comment by vant...@o2.pl
on 18 May 2012 at 1:02
Original issue reported on code.google.com by
aivar.p...@gmail.com
on 29 Feb 2012 at 3:43