microsoft / datamations

https://microsoft.github.io/datamations/

Prototype how funky groupings could look better #105

Closed jhofman closed 2 years ago

jhofman commented 2 years ago

Here are some test specs and their videos:

100 points to 50/50

https://user-images.githubusercontent.com/15895337/138937144-ba792f6b-9353-4405-b449-7317afbb4864.mov

[
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Initial data"
    },
    "data": {
      "values": [
        {
          "n": 100
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      }
    }
  },
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Group by group",
      "splitField": "group",
      "axes": false
    },
    "data": {
      "values": [
        {
          "group": "group1",
          "n": 50
        },
        {
          "group": "group2",
          "n": 50
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      },
      "color": {
        "field": null,
        "type": "nominal"
      },
      "tooltip": [
        {
          "field": "group",
          "type": "nominal"
        }
      ]
    }
  }
] 

100 points to 48/52 (to see the crossover when the group size isn't a multiple of 10, i.e. sqrt(100))

https://user-images.githubusercontent.com/15895337/138937175-30a9f210-c5d1-4fe0-8921-52dd390ef08d.mov

[
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Initial data"
    },
    "data": {
      "values": [
        {
          "n": 100
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      }
    }
  },
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Group by group",
      "splitField": "group",
      "axes": false
    },
    "data": {
      "values": [
        {
          "group": "group1",
          "n": 48
        },
        {
          "group": "group2",
          "n": 52
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      },
      "color": {
        "field": null,
        "type": "nominal"
      },
      "tooltip": [
        {
          "field": "group",
          "type": "nominal"
        }
      ]
    }
  }
] 

100 points to 87/13 (again to see the crossover)

https://user-images.githubusercontent.com/15895337/138937434-a4bbc02b-04b3-49b3-94f7-f9bc059224fd.mov

[
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Initial data"
    },
    "data": {
      "values": [
        {
          "n": 100
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      }
    }
  },
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Group by group",
      "splitField": "group",
      "axes": false
    },
    "data": {
      "values": [
        {
          "group": "group1",
          "n": 87
        },
        {
          "group": "group2",
          "n": 13
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      },
      "color": {
        "field": null,
        "type": "nominal"
      },
      "tooltip": [
        {
          "field": "group",
          "type": "nominal"
        }
      ]
    }
  }
] 

100 points to 20 groups of 5

https://user-images.githubusercontent.com/15895337/138937478-7de5d5d0-f40c-4f75-a4d8-92a8aef5bb90.mov

[
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Initial data"
    },
    "data": {
      "values": [
        {
          "n": 100
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      }
    }
  },
  {
    "height": 300,
    "width": 300,
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "meta": {
      "parse": "grid",
      "description": "Group by group",
      "splitField": "group",
      "axes": false
    },
    "data": {
      "values": [
        {
          "group": "group1",
          "n": 5
        },
        {
          "group": "group10",
          "n": 5
        },
        {
          "group": "group11",
          "n": 5
        },
        {
          "group": "group12",
          "n": 5
        },
        {
          "group": "group13",
          "n": 5
        },
        {
          "group": "group14",
          "n": 5
        },
        {
          "group": "group15",
          "n": 5
        },
        {
          "group": "group16",
          "n": 5
        },
        {
          "group": "group17",
          "n": 5
        },
        {
          "group": "group18",
          "n": 5
        },
        {
          "group": "group19",
          "n": 5
        },
        {
          "group": "group2",
          "n": 5
        },
        {
          "group": "group20",
          "n": 5
        },
        {
          "group": "group3",
          "n": 5
        },
        {
          "group": "group4",
          "n": 5
        },
        {
          "group": "group5",
          "n": 5
        },
        {
          "group": "group6",
          "n": 5
        },
        {
          "group": "group7",
          "n": 5
        },
        {
          "group": "group8",
          "n": 5
        },
        {
          "group": "group9",
          "n": 5
        }
      ]
    },
    "mark": {
      "type": "point",
      "filled": true
    },
    "encoding": {
      "x": {
        "field": "datamations_x",
        "type": "quantitative",
        "axis": null
      },
      "y": {
        "field": "datamations_y",
        "type": "quantitative",
        "axis": null
      },
      "color": {
        "field": null,
        "type": "nominal"
      },
      "tooltip": [
        {
          "field": "group",
          "type": "nominal"
        }
      ]
    }
  }
] 

Originally posted by @sharlagelfand in https://github.com/microsoft/datamations/issues/102#issuecomment-952194920

jhofman commented 2 years ago

The 20 groups of 5 look kind of strange at the moment. @giorgi-ghviniashvili, can you visually prototype how these might look more intuitive and then we'll figure out how to make that algorithmic in datamations code?

giorgi-ghviniashvili commented 2 years ago

@jhofman @sharlagelfand I thought that maybe we can first collapse 10 groups into 5 and then animate right? Like this:

https://user-images.githubusercontent.com/6615532/140492806-07bdf9af-d1c1-4643-8a0e-f99556ded3bb.mov

But from what I found, we can use gemini2 to specify the intermediate animations. Check this out: https://uwdata.github.io/gemini2-editor/

jhofman commented 2 years ago

We all liked the idea of a vertically squashed layout (whitespace on top and bottom) for these many small groups.

Maybe the heuristic here is that you don't want too much whitespace between points in a group in some "absolute" sense (relative to the size of the bounding box)?

@giorgi-ghviniashvili, can you play around with this and also with the "faceted grids" arranged as we drew on paper?

whatever we decide we'll need a generalizable rule for how to handle this.

giorgi-ghviniashvili commented 2 years ago

Adjusted the y-axis domain so it has a good amount of padding.

https://user-images.githubusercontent.com/6615532/140720318-37d6cf43-be11-44ff-a318-83a90eba4033.mov

jhofman commented 2 years ago

Small note here that if we go to a group_by with a skewed number of points in one group, it can get hard to see the points in that large group.

Example here:

https://github.com/microsoft/datamations/issues/114#issuecomment-971887510

jhofman commented 2 years ago

@giorgi-ghviniashvili, can you think a bit about heuristics for min and max distance between points in a group, and how that could affect row/column layouts?

giorgi-ghviniashvili commented 2 years ago

@jhofman here is how I calculate rows now:

    // calculate rows: considering spec height for rows calculation

    const n = d3.max(vegaLiteSpecs[0].data.values, d => d.n);
    let rows = Math.ceil(Math.sqrt(n));

    const gap = 2;
    const distance = 6 + gap;

    const {height: firstSpecHeight} = vegaLiteSpecs[0].spec || vegaLiteSpecs[0];

    if (firstSpecHeight / rows > distance) {
      rows = Math.floor(firstSpecHeight / distance)
    }

    // end of rows calculation

And it looks like this now:

image
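For readers who want to try the heuristic above outside the datamations codebase, here is a minimal standalone sketch. The function name `calculateRows` and its parameters are hypothetical, `Math.max` stands in for `d3.max` so it runs without dependencies, and reading the `6` in the snippet as the point size is an assumption.

```javascript
// Hypothetical standalone version of the row calculation above.
// Math.max replaces d3.max so the sketch has no dependencies.
function calculateRows(values, specHeight, pointSize = 6, gap = 2) {
  // Largest number of points that any single entry has to display.
  const n = Math.max(...values.map((d) => d.n));

  // Minimum vertical spacing per row (assumed: point size plus gap).
  const distance = pointSize + gap;

  // Start from a square grid.
  let rows = Math.ceil(Math.sqrt(n));

  // If the square grid leaves more than `distance` pixels per row,
  // switch to as many rows as fit at the minimum spacing.
  if (specHeight / rows > distance) {
    rows = Math.floor(specHeight / distance);
  }
  return rows;
}

const rows = calculateRows([{ n: 100 }], 300);
const cols = Math.ceil(100 / rows);
console.log(rows, cols); // 37 3
```

With the 300px test specs at the top of this issue, 100 points end up in 37 rows and 3 columns instead of a 10 by 10 grid.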

jhofman commented 2 years ago

@giorgi-ghviniashvili: can you re-render the test cases at the top of this issue using this new heuristic and see how it looks?

let's also check out how the covid by age and vax status here looks: https://github.com/microsoft/datamations/issues/115#issuecomment-972086912

giorgi-ghviniashvili commented 2 years ago

Here is how the 20 groups look. Better than the example at the top:

https://user-images.githubusercontent.com/6615532/143875212-83fefdce-d129-4658-82b2-28665e11166c.mov

We can also apply a max number of rows if you want, to avoid 3 columns. But for me this is OK without a max number of rows.

giorgi-ghviniashvili commented 2 years ago

52/48 looks odd to me:

https://user-images.githubusercontent.com/6615532/143875696-08010128-8571-4c5b-975e-fa0ad2b60250.mov

giorgi-ghviniashvili commented 2 years ago

50/50 looks the same as 48/52. Odd. Maybe we can apply a threshold: if the number of circles is greater than the threshold, say 200 or 300, calculate the number of rows based on the min distance to avoid overlapping; otherwise, just use the square-root approach. What do you think, @jhofman? With few points in total (100 or fewer), the new heuristic makes the viz very narrow and vertical. I don't like it.
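A rough sketch of the threshold idea, to make the proposal concrete. The function name `rowsWithThreshold` is hypothetical, the default threshold of 200 is one of the two values floated above, and the distance of 8 is taken from `6 + gap` in the earlier snippet.

```javascript
// Hypothetical sketch of the proposed threshold rule.
// Below the threshold: square-root grid. Above it: rows based on min distance.
function rowsWithThreshold(totalPoints, specHeight, threshold = 200, distance = 8) {
  if (totalPoints <= threshold) {
    // Few points: keep the roughly square grid.
    return Math.ceil(Math.sqrt(totalPoints));
  }
  // Many points: use as many rows as fit at the minimum spacing,
  // which reduces the number of columns and the risk of overlap.
  return Math.floor(specHeight / distance);
}

console.log(rowsWithThreshold(100, 300)); // 10
console.log(rowsWithThreshold(500, 300)); // 37
```

Under this rule the 100-point test cases would keep their 10 by 10 grid, and only larger datasets would switch to the taller layout.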

jhofman commented 2 years ago

agreed that the newer version gets a little weird and might not generalize well.

seems like we may want to go with something more like this but try to adjust the gap between points in subsequent frames.

@giorgi-ghviniashvili will look at this for the next mtg.

jhofman commented 2 years ago

this will be top priority for @giorgi-ghviniashvili.

once done, @sharlagelfand will merge in ALL THE THINGS.

giorgi-ghviniashvili commented 2 years ago

So the original reason we started to consider the gap when calculating rows was the overlapping circles:

image

We could increase the number of rows and have fewer columns to avoid overlapping.

With this in mind, here is my solution:

Here is the code and result:

let maxCols = Math.ceil(d3.max(specValues, d => d.n) / rows);

// if width divided by maxCols is less than 5, 
// then take up all vertical space to increase rows and reduce columns 
if (specWidth / maxCols < 5) {
  rows = Math.floor(specHeight / distance);
  maxCols = Math.ceil(d3.max(specValues, d => d.n) / rows);
}

Result: image

image

Here it stays at the square-root number of rows, because we don't really need to adjust the rows. They don't overlap!

image

image

So to sum up, the funky grouping is still there. I guess increasing the number of rows cannot fix that, but at least we can MERGE.
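To make the final rule in this comment easier to test in isolation, here is a minimal self-contained sketch. The function name `layoutGrid` is hypothetical, `Math.max` stands in for `d3.max`, the initial `rows` is assumed to be the square-root value mentioned above, and the defaults of 8 for `distance` and 5 for the minimum column width come from the snippets in this thread.

```javascript
// Hypothetical standalone version of the final heuristic.
// Rows stay at the square-root value unless columns would get too narrow.
function layoutGrid(values, specWidth, specHeight, distance = 8, minColWidth = 5) {
  const maxN = Math.max(...values.map((d) => d.n));

  // Default: roughly square grid.
  let rows = Math.ceil(Math.sqrt(maxN));
  let maxCols = Math.ceil(maxN / rows);

  // If each column would get less than minColWidth pixels, take up all
  // vertical space to increase rows and reduce columns.
  if (specWidth / maxCols < minColWidth) {
    rows = Math.floor(specHeight / distance);
    maxCols = Math.ceil(maxN / rows);
  }
  return { rows, maxCols };
}

console.log(layoutGrid([{ n: 100 }], 300, 300)); // { rows: 10, maxCols: 10 }
console.log(layoutGrid([{ n: 400 }], 60, 300)); // { rows: 37, maxCols: 11 }
```

The first call leaves the square grid untouched because 30px per column is wide enough; the second, with a hypothetical 60px-wide facet, switches to 37 rows so the columns no longer overlap.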

sharlagelfand commented 2 years ago

Closed by #118