gdtiti / alivepdf

Automatically exported from code.google.com/p/alivepdf

Multiple font settings within one <P></P> are not processed properly. #205

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a RichTextEditor in an AIR application.
2. Type in text as shown in the attached editor.png.
3. Call PDF.writeFlashHtmlText().
4. The output differs between library versions 1.4.9 and 1.5RC; see the
attached PDFGenerator_1.4.9.pdf and PDFGenerator_1.5RC.pdf.

What is the expected output? What do you see instead?
The text in the output PDF file is expected to look identical to the text
shown in the editor.
For the actual output, please refer to the two attached PDF files.

What version of the product are you using? On what operating system?
AlivePDF 1.4.9 & 1.5RC
Windows 7

Please provide any additional information below.
The PDF produced by 1.5RC simply ignores the font size specified in the
rich-text formatted string.
This is because the variable "fs" is never used after the font size is
retrieved.
I have posted a separate issue about this (ID 201).
However, as you can see, 1.4.9 does not handle font size properly either,
for example when multiple font sizes are specified within one paragraph
tag <P></P>, as shown in the attached PDFGenerator_1.4.9.pdf.

As I couldn't find any specification on Adobe's website, I experimented and
came up with a general DTD that roughly describes Flex's rich-text
formatted string, which uses an XML structure:

<!ELEMENT P (FONT)>
<!ELEMENT FONT (#PCDATA | FONT | B)*>
<!ELEMENT B (#PCDATA | I)*>
<!ELEMENT I (#PCDATA | U)*>
<!ELEMENT U (#PCDATA)>

<!ATTLIST FONT
FACE CDATA #REQUIRED
SIZE CDATA #REQUIRED
COLOR CDATA "#000000"
LETTERSPACING CDATA #REQUIRED
KERNING CDATA #REQUIRED
>

According to the DTD (actually according to RichTextEditor.htmlText),
for each <P/>, font face, font size and font colour are set within a
<FONT/>, while bold, italic and underline have their own tags.
(Bold, italic and underline are handled pretty well in the original source,
using boolean variables that act like switches.)
Since each <P/> contains one and only one top-level <FONT/>, using more
than one font setting means at least one level of nested <FONT/> inside
that <P/>.

In the example shown in the attached editor.png, the first line (the first
<P/>) has two different font settings: font size 12 (the smaller one) and
font size 24 (the bigger one). This results in a nested structure of font
tags. A simplified pseudo-structure is as follows:

<P ALIGN="LEFT">
    <FONT FACE="Arial" SIZE="12" COLOR="#000000" LETTERSPACING="0" KERNING="0">
        AA
        <FONT SIZE="24">AA</FONT>
        AAA
    </FONT>
</P>

We can see that the nested <FONT/> is based on the outer one: it specifies
no attribute other than "SIZE" and inherits the remaining attributes from
the outer <FONT/>.

PDF.writeFlashHtmlText() first calls the PDF.parseTags() method, which
flattens the tree structure of this XML object and produces an array
like this:
INDEX    :    TYPE    :    VALUE
0    :    HTMLTag    :    <P>
1    :    HTMLTag    :    <FONT FACE="Arial" SIZE="12">
2    :    HTMLTag    :    AA
3    :    HTMLTag    :    <FONT SIZE="24">
4    :    HTMLTag    :    AA
5    :    HTMLTag    :    </FONT>
6    :    HTMLTag    :    AAA
7    :    HTMLTag    :    </FONT>
8    :    HTMLTag    :    </P>
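
As a rough illustration of this flattening step, here is a minimal Python sketch. It is not AlivePDF's parseTags(): the `flatten` helper and the regular expression are assumptions made for this example. It only shows how the nested markup becomes a linear list of tags and text runs.

```python
import re

# Hypothetical stand-in for PDF.parseTags(): a tag is "<...>", anything
# else up to the next "<" is a text run.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")

def flatten(markup):
    """Return tags and non-empty text runs in document order."""
    tokens = []
    for match in TOKEN_RE.finditer(markup):
        token = match.group().strip()
        if token:  # skip whitespace-only runs between tags
            tokens.append(token)
    return tokens

markup = (
    '<P ALIGN="LEFT"><FONT FACE="Arial" SIZE="12">AA'
    '<FONT SIZE="24">AA</FONT>AAA</FONT></P>'
)
tokens = flatten(markup)
for index, token in enumerate(tokens):
    print(index, ":", token)
```

The nesting information is gone after this step; only the order of start tags, text and end tags remains.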

The following walk-through explains why font size does not work properly in
1.4.9 (the reason is similar for 1.5RC, except that 1.5RC never uses the
variable "fs" at all).
The array is processed linearly, from index 0 to 8.
When the program meets the first <FONT> tag, it sets the output font size
(the variable "fs") to 12, so the subsequent text "AA" is rendered at
font size 12.
At the second <FONT> tag, the program sets "fs" to 24, so the text "AA" at
index 4 is rendered at font size 24.
After the </FONT>, the next element in the array is the text "AAA". HOWEVER,
"fs" is still 24, so "AAA" is also rendered at font size 24, as shown in the
attached PDFGenerator_1.4.9.pdf.
The cause of this blunder is that the program FORGETS the font size of the
outer <FONT>.
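
The behaviour described above can be reproduced with a small Python sketch. This is a simulation under assumptions, not the library's ActionScript code: `render_with_single_variable` is a hypothetical name, and the only detail taken from the report is that a single variable "fs" is overwritten on every <FONT> tag and never restored on </FONT>.

```python
import re

SIZE_RE = re.compile(r'SIZE="(\d+)"')

def render_with_single_variable(tokens):
    """Simulate the reported behaviour: one size variable, never restored."""
    fs = None
    rendered = []
    for token in tokens:
        if token.startswith("</"):
            continue  # closing tags do NOT restore the previous size
        if token.startswith("<"):
            match = SIZE_RE.search(token)
            if match:
                fs = int(match.group(1))  # overwrite, forgetting the old value
        else:
            rendered.append((token, fs))  # text is drawn with whatever fs holds
    return rendered

tokens = [
    "<P>", '<FONT FACE="Arial" SIZE="12">', "AA",
    '<FONT SIZE="24">', "AA", "</FONT>", "AAA", "</FONT>", "</P>",
]
result = render_with_single_variable(tokens)
print(result)  # [('AA', 12), ('AA', 24), ('AAA', 24)]
```

The last pair shows the problem: "AAA" belongs to the outer <FONT> but keeps the inner size 24.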

As this library is apparently far from finished, I didn't look into the
org.alivepdf.fonts package to see the details of its font structure. But I
think the way it processes <FONT/> is not correct, and that is why font
settings are not handled well.
I therefore created another class named MasterFont (attached) to hold the
font settings taken from <FONT> tags during the traversal of the
HTMLTag array. A stack is used to store all MasterFont objects created.

The detailed procedure is:
Whenever an HTMLTag representing a <FONT> start tag is encountered, a
MasterFont is created, either based on the current MasterFont (if there is
one) or from scratch (using the attributes specified inside the <FONT> tag).
When it is based on the current MasterFont, the new object is first
initialised from the current one (on top of the stack), and then the
attribute values given in the current <FONT> tag are assigned to it.
The newly created MasterFont object is then pushed onto the stack.
Whenever an HTMLTag representing a </FONT> end tag is encountered, the top
MasterFont object is popped from the stack.
With this stack maintained, whenever text data is met during the traversal
of the HTMLTag array, the program can render it using the correct font
settings stored in the MasterFont object at the top of the stack.
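
The procedure above can be sketched in Python as follows. The original MasterFont.as attachment is not available in this thread, so everything here is an assumption for illustration: the field names, the default values used when no outer font exists, and the helper names `derive` and `render_with_stack`. The sketch also assumes well-formed input in which every </FONT> has a matching <FONT>.

```python
import re
from dataclasses import dataclass, replace

ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

@dataclass(frozen=True)
class MasterFont:
    # Hypothetical fields and defaults; the real class may differ.
    face: str = "Arial"
    size: int = 12
    color: str = "#000000"

def derive(parent, tag):
    """Copy the parent font (or the defaults), then apply the tag's attributes."""
    attrs = {name.upper(): value for name, value in ATTR_RE.findall(tag)}
    base = parent if parent is not None else MasterFont()
    changes = {}
    if "FACE" in attrs:
        changes["face"] = attrs["FACE"]
    if "SIZE" in attrs:
        changes["size"] = int(attrs["SIZE"])
    if "COLOR" in attrs:
        changes["color"] = attrs["COLOR"]
    return replace(base, **changes)

def render_with_stack(tokens):
    """Pair each text run with the size of the font on top of the stack."""
    stack = []
    rendered = []
    for token in tokens:
        if token.upper().startswith("<FONT"):
            parent = stack[-1] if stack else None
            stack.append(derive(parent, token))  # start tag: push
        elif token.upper() == "</FONT>":
            stack.pop()                          # end tag: restore outer font
        elif not token.startswith("<"):
            font = stack[-1] if stack else MasterFont()
            rendered.append((token, font.size))
    return rendered

tokens = [
    "<P>", '<FONT FACE="Arial" SIZE="12">', "AA",
    '<FONT SIZE="24">', "AA", "</FONT>", "AAA", "</FONT>", "</P>",
]
fixed = render_with_stack(tokens)
print(fixed)  # [('AA', 12), ('AA', 24), ('AAA', 12)]
```

Because the inner font is popped at </FONT>, "AAA" is rendered with the outer size 12 again. Copying the parent before applying the tag's attributes is what gives the nested <FONT SIZE="24"> its inherited face and colour.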

With this change to the source code of 1.5RC, PDF.writeFlashHtmlText() can
handle multiple font sizes and font colours within one <P/>.
The FACE attribute of <FONT/> is not processed in either 1.4.9 or 1.5RC:
the "switch" statement contains only a TODO comment for it, without any
implementation.

Original issue reported on code.google.com by stephen....@gmail.com on 3 Mar 2010 at 3:31

GoogleCodeExporter commented 9 years ago
Font face could actually be implemented in a similar way.

But in the current 1.5RC, Arial is mapped to Helvetica.
Also, FontFamily is not fully compatible with the font names used by
Flex's RichTextEditor.
For example, "Times New Roman" is not accepted by 1.5RC's FontFamily; it
only accepts "Times".

There are some other trivial problems like this.
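
One possible way to bridge such naming differences is a small alias table applied to the FACE value before it reaches the font lookup. The Python sketch below is purely illustrative: `FACE_ALIASES` and `normalize_face` are hypothetical names, not part of AlivePDF, and only the two pairs mentioned above are included.

```python
# Hypothetical alias table: editor font names on the left,
# names the PDF side is said to accept on the right.
FACE_ALIASES = {
    "arial": "Helvetica",
    "times new roman": "Times",
}

def normalize_face(face):
    """Map an editor font name to an accepted name; pass unknown names through."""
    return FACE_ALIASES.get(face.strip().lower(), face)

print(normalize_face("Times New Roman"))  # Times
print(normalize_face("Arial"))            # Helvetica
print(normalize_face("Courier"))          # Courier
```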

Original comment by stephen....@gmail.com on 3 Mar 2010 at 4:31

GoogleCodeExporter commented 9 years ago
Can we get the MasterFont.as, please? I can't see it attached.

Original comment by Jem...@gmail.com on 24 Aug 2011 at 5:28

GoogleCodeExporter commented 9 years ago
Hello there,

Sorry, I don't have the file you want.
And I can barely remember what I did at that time ...

Do you really think this project is still active?

Original comment by stephen....@gmail.com on 30 Aug 2011 at 2:13

GoogleCodeExporter commented 9 years ago
 - http://code.google.com/p/alivepdf/issues/detail?id=205 => fixed in r281. Font sizes are now processed correctly. Please refer to the "writeFlashHtmlText how-to" in http://code.google.com/p/alivepdf/wiki/APIAdditions . I'm afraid there is no improvement regarding the font face (for now).

Original comment by felix.ge...@gmail.com on 8 Oct 2011 at 8:44