GlenKPeterson / PdfLayoutManager

Adds line-breaking, page-breaking, tables, and styles to PDFBox

Long words do not wrap correctly in cells #11

Closed tylerbrazier closed 8 years ago

tylerbrazier commented 8 years ago

The example below demonstrates how using a long word in a cell causes the text to spill over into the next cell. See result.pdf. The text does wrap, but at the wrong point.

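import java.awt.Color;
import java.io.FileOutputStream;

import org.apache.pdfbox.pdmodel.font.PDType1Font;

import com.planbase.pdf.layoutmgr.*;

public class PdfTest {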
    final static float margin = 40;

    public static void main(String[] args) throws Exception {
        PdfLayoutMgr pageMgr = PdfLayoutMgr.newRgbPageMgr();
        LogicalPage lp = pageMgr.logicalPageStart();
        float tableWidth = lp.pageWidth() - (2*margin);
        lp.putRow(margin, lp.yPageTop(), createRow(tableWidth));
        lp.commit();
        pageMgr.save(new FileOutputStream("result.pdf"));
    }

    private static Cell[] createRow(float tableWidth) {
        Cell[] result = new Cell[5];
        float cellWidth = tableWidth/result.length;
        CellStyle cellStyle = CellStyle.of(CellStyle.Align.MIDDLE_LEFT, Padding.NO_PADDING, null, BorderStyle.of(Color.BLACK));
        TextStyle textStyle = TextStyle.of(PDType1Font.COURIER, 9, Color.BLACK);

        for (int i=0; i<result.length; i++) {
            String text = (i == 0) ? "aabbccddeeffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz" : "";
            result[i] = Cell.of(cellStyle, cellWidth, textStyle, text);
        }
        return result;
    }
}

GlenKPeterson commented 8 years ago

Text generally only wraps at whitespace. The text wrapping algorithm picks a slightly long starting guess for where to wrap the text, then steps backward looking for whitespace. I think what you're seeing here is that it doesn't find any whitespace, so it just truncates the first line at its original guess length, adds a return, and continues the rest of the text on the next line.
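To make that behavior concrete, here is a minimal sketch of the strategy described above. It is not the library's actual code: `WrapSketch`, `wrap`, and `maxChars` are hypothetical names, and it counts characters instead of measuring rendered text width, which is only a fair approximation for a monospaced font such as Courier.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapSketch {
    /**
     * Hypothetical helper: splits text into lines of at most maxChars characters.
     * Assumption: character count stands in for measured text width.
     */
    static List<String> wrap(String text, int maxChars) {
        if (maxChars < 1) {
            throw new IllegalArgumentException("maxChars must be at least 1");
        }
        List<String> lines = new ArrayList<>();
        String rest = text.trim();
        while (rest.length() > maxChars) {
            // Start at the guess and step backward looking for whitespace.
            int breakAt = maxChars;
            while (breakAt > 0 && !Character.isWhitespace(rest.charAt(breakAt))) {
                breakAt--;
            }
            if (breakAt == 0) {
                // No whitespace found: cut the line at the original guess.
                breakAt = maxChars;
            }
            lines.add(rest.substring(0, breakAt).trim());
            rest = rest.substring(breakAt).trim();
        }
        if (!rest.isEmpty()) {
            lines.add(rest);
        }
        return lines;
    }
}
```

With this sketch, `wrap("the quick brown fox", 10)` breaks at the space before "brown", while `wrap("aabbccddeeff", 5)` finds no whitespace and cuts every 5 characters, which mirrors the mid-word break reported in this issue.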

In HTML, your example would not wrap at all.

I think the key take-away here is that if you want text wrapping to work, the text needs occasional whitespace. In theory, for a given language, one could enhance the text wrapping algorithm to follow the proper rules for that language - finding syllable breaks, or hyphens, or whatever. But that involves a lot of code for every language you support.
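One possible workaround on the caller's side, sketched under assumptions: pre-process the text so that no run of non-whitespace characters exceeds a chosen length before handing it to the cell. `LongWordBreaker`, `addBreakOpportunities`, and `maxRun` are hypothetical names, not part of PdfLayoutManager, and the right `maxRun` depends on the cell width and font.

```java
public class LongWordBreaker {
    /**
     * Hypothetical helper: inserts a space after every maxRun consecutive
     * non-whitespace characters so a wrapping algorithm has somewhere to break.
     */
    static String addBreakOpportunities(String text, int maxRun) {
        if (maxRun < 1) {
            throw new IllegalArgumentException("maxRun must be at least 1");
        }
        StringBuilder out = new StringBuilder();
        int run = 0; // length of the current run of non-whitespace characters
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                run = 0;
            } else {
                if (run == maxRun) {
                    // The run is full: add a break opportunity before this character.
                    out.append(' ');
                    run = 0;
                }
                run++;
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

In the example from this issue, the result could be passed in place of `text` in `Cell.of(cellStyle, cellWidth, textStyle, text)`. The trade-off is that the inserted spaces are part of the text, so they also show up if the word ends up fitting on one line.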

Sorry for the long wait. I was surprised that this text wrapped at all and it took me a while to really look into this and give you what is hopefully a good answer.