Multi_cell display inconsistency

hau-ifs commented 2 years ago

Hi team,

The library is awesome and has gotten me really far. Much appreciated.

I was just wondering if this is intended while using multi_cell and whether there is an alternative to what I am trying to achieve.

In some cases using multi_cell, the initial line appears to double the amount of spaces.

I have also been using multi_cell to truncate text: I call it with split_only=True, take the first element, and replace the last three characters with an ellipsis. However, this does not work when the text contains spaces or soft hyphens, because the split happens at the most recent delimiter rather than at the character that caused the text to exceed the multi_cell width.

Is there another way to achieve the above?

from fpdf import FPDF
from fpdf.enums import XPos, YPos

pdf = FPDF()

example_text = "Lorem - Ipsum - DOLOR - DDDDD - hello"
example_text2 = "Lorem - Ipsum - DOLOR - DDDDDD - hello"
example_text3 = "LoremLoremLoremLoremLoremLoremLoremLoremLoremLorem"

pdf.add_page()

pdf.set_font(family='helvetica', style='B', size=10)
pdf.multi_cell(w=60, h=4, txt=example_text, border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)
# Splitting on most recent delimiter and also doubling the spaces
pdf.cell(w=60, h=4, txt="", border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)
pdf.multi_cell(w=60, h=4, txt=example_text2, border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)

# Truncation method
pdf.cell(w=60, h=4, txt="", border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)
pdf.multi_cell(w=60, h=4, txt=example_text3, border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)
pdf.cell(w=60, h=4, txt="", border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)
text_split = pdf.multi_cell(w=60, h=4, txt=example_text3, split_only=True, border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)
pdf.cell(w=60, h=4, txt=f"{text_split[0][:-3]}...", border=0, new_x=XPos.LMARGIN, new_y=YPos.NEXT)

pdf.output("multi_cell.pdf")

gmischler commented 2 years ago

In some cases using multi_cell, the initial line appears to double the amount of spaces.

Thanks for reporting.

In your first issue, the longer text of your second example causes the line to wrap earlier. Since the default alignment is "justified", the word spacing thus necessarily becomes larger. If you don't want the word spacing to change depending on the amount of text per line, just use a different alignment.
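To make the effect concrete, here is a rough sketch of the arithmetic behind justification. The widths are hypothetical numbers chosen only for illustration (they are not measured from your PDF), the helper names are made up for this example, and cell padding is ignored. The second function shows the fix in context, using align="L" with the same call as in your script.

```python
def extra_space_per_gap(available_width, natural_width, gap_count):
    """Slack that a justified line has to add to each gap between words."""
    if gap_count == 0:
        return 0.0
    return (available_width - natural_width) / gap_count


# Hypothetical widths in mm, for illustration only.
nearly_full = extra_space_per_gap(60, 58, 8)    # line that almost fills the cell
wrapped_early = extra_space_per_gap(60, 44, 6)  # line that lost its last word


def render_left_aligned(path="multi_cell_left.pdf"):
    """Render the second example left-aligned, so word gaps keep their natural width."""
    from fpdf import FPDF  # imported lazily; requires fpdf2 to be installed
    from fpdf.enums import XPos, YPos

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font(family="helvetica", style="B", size=10)
    pdf.multi_cell(
        w=60, h=4, txt="Lorem - Ipsum - DOLOR - DDDDDD - hello",
        align="L", new_x=XPos.LMARGIN, new_y=YPos.NEXT,
    )
    pdf.output(path)
```

The line that wraps earlier has more leftover width and fewer gaps to spread it over, so each gap grows noticeably.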

In your second issue, I'm not entirely sure what you're asking for. Can you elaborate in a bit more detail?

hau-ifs commented 2 years ago

Ah, setting the alignment prevented the extra word spacing. Thanks for helping 😄.

For the second issue, is it possible to split the text on the character regardless of the most recent space or soft-hyphen?

In the image posted above, the text "Lorem - Ipsum - DOLOR - DDDDDD - hello" when using split_only will return ["Lorem - Ipsum - DOLOR -", "DDDDDD - hello"]. Is it possible to return ["Lorem - Ipsum - DOLOR - DDDDD", "D - hello"] by sending appropriate arguments?

I suppose the better question is: given a text and a width, is it possible to determine which substring fits within the width? I was using multi_cell as a proxy for this question.

gmischler commented 2 years ago

For the second issue, is it possible to split the text on the character regardless of the most recent space or soft-hyphen?

The optimal way would of course be to implement an option for the library to do character based line wrapping. I'm not sure what the best public API for that option would be though, or what other possibilities should be considered to round out the functionality.

fpdf2 currently only does word-based line wrapping, with the ability to split inside a word only as a stop-gap measure when a word is too long to fit on a line.

As a quick workaround, you can simply substitute your spaces with a non-breaking space character such as the narrow no-break space "\u202f". This will make your string look to fpdf2 like a single word, which it will then split at the rightmost possible position on each line. You'll probably need to use a Unicode font for that to work without further processing.
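A minimal sketch of that substitution, untested against your setup. The helper names are made up for this example, `pdf` is assumed to be an FPDF instance with a font already set that can handle the replacement character, and the text is passed positionally as the third argument of multi_cell.

```python
NO_BREAK = "\u202f"  # narrow no-break space, as suggested above


def protect_spaces(text, no_break=NO_BREAK):
    """Replace normal spaces so the whole string looks like a single word."""
    return text.replace(" ", no_break)


def restore_spaces(text, no_break=NO_BREAK):
    """Undo protect_spaces() on a line returned by the splitter."""
    return text.replace(no_break, " ")


def split_by_character(pdf, text, width, line_height=4):
    """Split text into lines at the rightmost character that still fits.

    `pdf` is assumed to behave like fpdf2's FPDF: multi_cell(..., split_only=True)
    returns the wrapped lines instead of rendering them.
    """
    lines = pdf.multi_cell(width, line_height, protect_spaces(text), split_only=True)
    return [restore_spaces(line) for line in lines]
```

With your example this would be called as split_by_character(pdf, example_text2, 60), and the first element is the part that fits on the first line.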

given a text and a width, is it possible to determine which substring fits within the width?

Same answer (using split_only=True). In that case, you can substitute normal spaces back in after splitting, which makes it possible to use a non-Unicode font. Theoretically you could use FPDF.get_string_width() to sneak up on the right length yourself, but since that is essentially what the existing line wrapping code already does, it would be rather redundant.
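For completeness, the do-it-yourself variant could look roughly like the sketch below. The function names are made up for this example and are not part of fpdf2. `measure` is any callable that returns the rendered width of a string, for instance pdf.get_string_width once a font is set. The binary search assumes that a longer prefix is never narrower than a shorter one.

```python
def longest_fitting_prefix(text, max_width, measure):
    """Return the longest prefix of text whose measured width is <= max_width."""
    lo, hi = 0, len(text)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if measure(text[:mid]) <= max_width:
            lo = mid       # this prefix fits, try a longer one
        else:
            hi = mid - 1   # too wide, try a shorter one
    return text[:lo]


def truncate_with_ellipsis(text, max_width, measure, ellipsis="..."):
    """Return text unchanged if it fits, else a fitting prefix plus an ellipsis."""
    if measure(text) <= max_width:
        return text
    budget = max_width - measure(ellipsis)  # room left for the actual text
    return longest_fitting_prefix(text, budget, measure) + ellipsis
```

Usage would be along the lines of truncate_with_ellipsis(example_text3, 60, pdf.get_string_width). Keep in mind that the usable width inside a cell is slightly smaller than w because of the cell padding (pdf.c_margin on each side), so you may want to subtract that from the width you pass in.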

hau-ifs commented 2 years ago

I was going to approach it using the last approach mentioned but I will have a play around. I'll close this question because I believe all my questions have been answered.

Thanks for the very detailed response.

py-pdf / fpdf2

Multi_cell display inconsistency #464