gonum / plot

A repository for plotting and visualizing data
BSD 3-Clause "New" or "Revised" License

plotter: Is current "uniformness" of heatmap coloring intentional? #777

Open seiyab opened 5 months ago

seiyab commented 5 months ago

What are you trying to do?

Drawing a heatmap.

What did you do?

func TestHeatMap(t *testing.T) {
    m := offsetUnitGrid{
        XOffset: -2,
        YOffset: -1,
        Data: mat.NewDense(3, 4, []float64{
            1, 2, 3, 4,
            5, 6, 7, 8,
            9, 10, 11, 12,
        }),
    }
    pal := myPalette{}
    h := plotter.NewHeatMap(m, pal)

    p := plot.New()
    p.Add(h)

    p.X.Padding = 0
    p.Y.Padding = 0

    img := vgimg.New(250, 175)
    dc := draw.New(img)
    p.Draw(dc)
    w, err := os.Create("golden_files/heatMap.png")
    if err != nil {
        t.Fatal(err)
    }
    defer w.Close()
    png := vgimg.PngCanvas{Canvas: img}
    if _, err = png.WriteTo(w); err != nil {
        t.Fatal(err)
    }
}

type myPalette struct{}

func (myPalette) Colors() []color.Color {
    return []color.Color{
        color.RGBA{R: 255, A: 255},
        color.RGBA{G: 255, A: 255},
        color.RGBA{B: 255, A: 255},
        color.RGBA{R: 255, G: 255, A: 255},
    }
}

// offsetUnitGrid is copied from https://github.com/gonum/plot/blob/b4fdc267610216647ec69681d47e431cf5bbed23/plotter/heat_test.go#L21

What did you expect to happen?

Each color to be used "uniformly". With 12 cells and 4 colors, I consider 3 cells per color (3 * 4 = 12) to be "uniform". The following code comment says the palette is scaled uniformly across the data range. https://github.com/gonum/plot/blob/b4fdc267610216647ec69681d47e431cf5bbed23/plotter/heat.go#L157

What actually happened?

2 cells for red, 4 cells for green, 4 cells for blue, 2 cells for yellow. This doesn't look uniform to me. [image: heatMap]

The first and last colors probably have half the chance of being used compared with the other colors. In my view, the following should be ps := float64(len(pal)) / (h.Max - h.Min) https://github.com/gonum/plot/blob/b4fdc267610216647ec69681d47e431cf5bbed23/plotter/heat.go#L157-L158 and the following should be col = pal[int((v-h.Min)*ps)] to get a uniform distribution. https://github.com/gonum/plot/blob/b4fdc267610216647ec69681d47e431cf5bbed23/plotter/heat.go#L222-L224

What version of Go and Gonum/plot are you using?

go version go1.22.0 darwin/arm64
gonum.org/v1/plot v0.14.0

Does this issue reproduce with the current master?

Not tested yet. Probably yes.

kortschak commented 5 months ago

End points of a line are special, as are all boundary conditions. This is further complicated by the fact that we're calculating a value for the midpoint of a box (this is where the 0.5 comes from). But all things considered, this is aesthetics; if you don't like how it is and you think that a different approach should be used, please demonstrate the alternative graphically (preferably with something that is closer to continuous than only four colours).

seiyab commented 5 months ago

I think I understand. You mean the midpoints of the boxes are uniformly distributed, right?

h.Min       h.Max
v           v
<----------->
^   ^   ^   ^
|   |   |   midpoint of yellow
|   |   midpoint of blue
|   midpoint of green
midpoint of red
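The diagram can be checked numerically. The sketch below computes the interval of values that each color covers when the color midpoints are spaced evenly from h.Min to h.Max. It assumes the mapping is int((v-h.Min)*ps + 0.5) with ps = (len(pal)-1)/(h.Max-h.Min), as discussed in this thread; midpointRange is a hypothetical helper, not part of gonum/plot.

```go
package main

import "fmt"

// midpointRange returns the interval of values that the midpoint-based
// mapping sends to palette index i, clipped to the data range [lo, hi].
// Assumption: colour midpoints are evenly spaced from lo to hi.
func midpointRange(i, n int, lo, hi float64) (from, to float64) {
	step := (hi - lo) / float64(n-1) // distance between adjacent midpoints
	from = lo + (float64(i)-0.5)*step
	to = lo + (float64(i)+0.5)*step
	if from < lo {
		from = lo
	}
	if to > hi {
		to = hi
	}
	return from, to
}

func main() {
	names := []string{"red", "green", "blue", "yellow"}
	for i, name := range names {
		from, to := midpointRange(i, len(names), 1, 12)
		fmt.Printf("%-6s from %.2f to %.2f (width %.2f)\n", name, from, to, to-from)
	}
}
```

With the data range [1, 12], the two end colors are clipped at the boundaries and cover half the width of the interior colors, which is consistent with the 2/4/4/2 cell counts reported above.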

What I expected is that each color occupies a range of the same size: say, 0-3 for red, 3-6 for green, 6-9 for blue, and 9-12 for yellow in the example above.

seiyab commented 5 months ago

Oops, I was somewhat confused. In the example, the data range is [1, 12], not [0, 12], so my explanation is not valid.

Anyway, my intuition is that each color occupies a range of the same size. For example, with data in [0, 4] and 4 colors, I expect the first color for [0, 1), the second for [1, 2), the third for [2, 3), and the fourth for [3, 4].

kortschak commented 5 months ago

The ASCII art you have is correct. But really, as I said, it's an aesthetic decision. If you think you have a better implementation, please show it and we can consider making the change. As it is, the intention is that gradation should be finer and the boundary behaviour diminishes to zero, and heat maps aren't really intended to show quantitative data in a way that is sensitive to this kind of issue.

seiyab commented 5 months ago

I edited the description to include the implementation I have in mind.

In my view, the following should be ps := float64(len(pal)) / (h.Max - h.Min) https://github.com/gonum/plot/blob/b4fdc267610216647ec69681d47e431cf5bbed23/plotter/heat.go#L157-L158 and the following should be col = pal[int((v-h.Min)*ps)] to get a uniform distribution. https://github.com/gonum/plot/blob/b4fdc267610216647ec69681d47e431cf5bbed23/plotter/heat.go#L222-L224

I'm not eager to change the behaviour or to add a new option. In this issue, I mainly wanted to make sure that it is intentional (and, of course, to "fix" it if it was unintentional).

Now I understand that the current behaviour is intentional and an aesthetic decision.

Honestly, I need a mapping that is "uniform" over ranges, not over midpoints. However, we already have a customized version of the heat map in our repository for another reason, so I'll just add this behaviour to our customized version.

This issue can be closed, or we can keep it open so that anyone who needs different behaviour can discuss it.