snargleplax commented 4 years ago

Synopsis

As of 26853c1aa3805707c943e8cbb499e23763498014 (according to git bisect), MultiCell's behavior changed in a way that was not documented in the intent of PR #291 or referenced issue https://github.com/jung-kurt/gofpdf/issues/276.

Repro

package main

import (
    "io/ioutil"
    "log"
    "strings"

    "github.com/jung-kurt/gofpdf"
)

func main() {
    f := gofpdf.New("P", "in", "Letter", "")
    f.SetMargins(margin, margin, -margin)
    f.SetFillColor(0xb6, 0xd0, 0xd3)
    f.AddPage()

    f.SetFont("Arial", "", 12)

    _, h := f.GetFontSize()
    h *= 1.3
    x, y := f.GetXY()
    kChunk := [][]string{[]string{"foo", "", ""}}
    vChunk := [][]string{[]string{"bar", "bar", "bar"}}

    for i, chK := range kChunk {
        chV := vChunk[i]
        if i > 0 {
            f.AddPage()
        }
        y = f.GetY()

        more := i < len(kChunk)-1
        var borderStr string
        switch {
        case i == 0 && more:
            borderStr = "TLR"
        case i > 0:
            if more {
                borderStr = "LR"
            } else {
                borderStr = "BLR"
            }
        default:
            borderStr = "1"
        }

        f.MultiCell(usableWidth/2, h, strings.Join(chK, "\n")+"\n", borderStr, "LM", true)
        f.SetXY(x+usableWidth/2, y)
        f.MultiCell(usableWidth/2, h, strings.Join(chV, "\n")+"\n", borderStr, "LM", true)
    }

    tf, err := ioutil.TempFile("", "gofpdf-repro")
    if err != nil {
        log.Fatalln("open temp file:", err)
    }

    if err := f.Output(tf); err != nil {
        log.Fatalln("write file output:", err)
    }

    log.Println("wrote ", tf.Name())
}

const (
    margin      = 1
    letterWidth = 8.5
    usableWidth = letterWidth - (margin * 2)
)

Expected results

Output should be same as before referenced commit:

Actual results

Analysis

Looks to me like the changes to the slice logic around lines 2563 and 2570 is probably the culprit. Seems to be trimming the slice in more cases than before, which may be right for Chinese but it also breaks my app.

jung-kurt commented 4 years ago

Thanks for the thorough investigation and test app. Sorry for the regression.

Looks to me like the changes to the slice logic around lines 2563 and 2570 is probably the culprit.

I agree. I'll focus on those operations.

jung-kurt commented 4 years ago

I believe the problem is the way MultiCell handles trailing newlines. In particular, the old code preserved all but the last trailing newline. The revised code strips them all. Your test code (thanks for including!) runs the way you expect with the former code and breaks with the subsequent code.

I can revert the non-utf-8 case (keep all but last trailing newline) and add a method (something like .SetMultiCellTrailingNewlineMode()) to allow non-utf-8 behavior to match utf-8 behavior. Do you have any suggestions as to the best way out of this mess?

snargleplax commented 4 years ago

I've been looking it over and don't understand why the revision needed to change the blank line trimming at all. What happens if you revert both cases, and just leave the isChinese checks in place?

jung-kurt commented 4 years ago

As the author of this change back in August, do you have any comments, @hyzgh?

hyzgh commented 4 years ago

@jung-kurt @snargleplax I'm sorry for it. When I wrote this code, I thought what original code wants to do is to remove extra line breaks, not matter UTF-8 or not-UTF-8. As you could see, the original code removes extra line breaks for UTF-8 but doesn't remove extra line breaks for not-UTF-8. The original code behaves differently for different fonts, which I think is a bug.

hyzgh commented 4 years ago

I think that keeping UTF-8 and not-UTF-8 behave the same and adding a new method something like SetMultiCellTrailingNewlineMode will be a good idea.

jung-kurt commented 4 years ago

The original code behaves differently for different fonts, which I think is a bug.

I agree, @hyzgh.

The UTF-8 code is only about 6 months old, while the non-UTF-8 code is 6 years. In order to make both font types work similarly I think it will be less disruptive to remove, in both cases, only the last newline if it is present. I can then do one of three things:

add SplitLinesExt() and MultiCellExt() that accepts a new line split structure that has a flag that indicates that all trailing newlines be stripped; this structure could be augmented with other flags in the future
use the same split line structure but accept it as an optional variadic argument at the end of MultiCell() and SplitLines(); I am not sure if this could subtly break things however
simply document the need to call strings.TrimRight() if needed before calling MultCell() or SplitLines().

Any thoughts?

snargleplax commented 4 years ago

I'd favor the third option for being parsimonious and limiting the scope of what gofpdf needs to be responsible for -- that's what I'd expect by default from a Go library, in the spirit of making small, composable components.

However, that change seems like it'd be a regression for existing code that doesn't have that TrimRight call. To avoid breaking anyone at all, but still fix the inconsistency and the font bug, I think you'd have to version the module.

If you don't want to version the module (or break anybody), I'd go with the Ext variants (which at least have some precedent), and revert the newline trimming bits from https://github.com/jung-kurt/gofpdf/pull/291. You could potentially deprecate them at birth and remove them once you've versioned the module and introduced the TrimRight semantics change to the non-Ext ones.

The variadic argument approach would most likely work, but I see a couple issues that make it less good. First, technically it's a breaking change because someone could have assigned it to a variable of function type (see this bit of this talk). Could be the case that nobody's done that, but it's possible, and they'd get a compilation failure. Second thing is just that using a variadic param for a zero-or-one case as opposed to zero-to-n seems like poor interface design -- what happens when someone passes in multiple values? So, I'd avoid that.

jung-kurt commented 4 years ago

Good points about using the variadic kluge.

I have reinstated the previous behavior for non-UTF-8 fonts. See MultiCell() for my note on this.

snargleplax commented 4 years ago

Awesome, that looks great. Thanks for the quick attention @jung-kurt and @hyzgh.

jung-kurt / gofpdf

Empty lines inappropriately trimmed by MultiCell (regression) #333

Synopsis

Repro

Expected results

Actual results

Analysis