iterative / dvc

🦉 Data Versioning and ML Experiments
https://dvc.org
Apache License 2.0

plots: incorporate strokedash into dvc #8970

Closed dberenbaum closed 10 months ago

dberenbaum commented 1 year ago

There's a rough POC of the ideas above in https://github.com/iterative/dvc/tree/plots-fields and https://github.com/iterative/dvc-render/tree/plots-fields.

These changes update the output to include these new fields:

```json
"dvc_id": "workspace::train/acc.tsv::train/acc",
"dvc_rev": "workspace",
"dvc_filename": "train/acc.tsv",
"dvc_field": "train/acc",
"dvc_source": "train/acc.tsv::train/acc"
```
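
Judging from the example values, the new fields appear to be composed from the revision, the file name, and the field name, joined with `::`. Here is a minimal sketch of that composition. The helper `annotate_datapoint` is hypothetical and not taken from the POC branches, and the joining rule is only inferred from the values shown above.

```python
def annotate_datapoint(point, rev, filename, field):
    """Return a copy of `point` with the dvc_* fields added.

    Assumption: dvc_source is "<filename>::<field>" and dvc_id is
    "<rev>::<dvc_source>", as inferred from the example values.
    """
    source = f"{filename}::{field}"
    return {
        **point,
        "dvc_id": f"{rev}::{source}",
        "dvc_rev": rev,
        "dvc_filename": filename,
        "dvc_field": field,
        "dvc_source": source,
    }


point = annotate_datapoint(
    {"step": "0", "train/acc": "0.5262"},
    rev="workspace",
    filename="train/acc.tsv",
    field="train/acc",
)
```

Keeping `dvc_rev` and `dvc_source` as separate fields is what would let a template map one to `color` and the other to `strokeDash`.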

Here's an example of `dvc plots diff --split --json` output:

```json
{
  "dvc.yaml::Accuracy": [
    {
      "type": "vega",
      "revisions": [
        "HEAD",
        "workspace"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>"
        },
        "title": "dvc.yaml::Accuracy",
        "width": 300,
        "height": 300,
        "params": [
          {
            "name": "smooth",
            "value": 0.001,
            "bind": {
              "input": "range",
              "min": 0.001,
              "max": 1,
              "step": 0.001
            }
          }
        ],
        "layer": [
          {
            "mark": "line",
            "encoding": {
              "x": {
                "field": "step",
                "type": "quantitative",
                "title": "step"
              },
              "y": {
                "field": "dvc_inferred_y_value",
                "type": "quantitative",
                "title": "accuracy",
                "scale": {
                  "zero": false
                }
              },
              "color": {
                "field": "dvc_rev",
                "type": "nominal"
              },
              "strokeDash": {
                "field": "dvc_source",
                "type": "nominal"
              }
            },
            "transform": [
              {
                "loess": "dvc_inferred_y_value",
                "on": "step",
                "groupby": [
                  "dvc_rev",
                  "dvc_source"
                ],
                "bandwidth": {
                  "signal": "smooth"
                }
              }
            ]
          },
          {
            "mark": {
              "type": "point",
              "tooltip": {
                "content": "data"
              }
            },
            "encoding": {
              "x": {
                "field": "step",
                "type": "quantitative",
                "title": "step"
              },
              "y": {
                "field": "dvc_inferred_y_value",
                "type": "quantitative",
                "title": "accuracy",
                "scale": {
                  "zero": false
                }
              },
              "color": {
                "field": "dvc_rev",
                "type": "nominal"
              },
              "strokeDash": {
                "field": "dvc_source",
                "type": "nominal"
              }
            }
          }
        ]
      },
      "datapoints": {
        "workspace": [
          {
            "step": "0",
            "train/acc": "0.5262",
            "dvc_inferred_y_value": "0.5262",
            "rev": "workspace::train/acc.tsv::train/acc",
            "dvc_id": "workspace::train/acc.tsv::train/acc",
            "dvc_rev": "workspace",
            "dvc_filename": "train/acc.tsv",
            "dvc_field": "train/acc",
            "dvc_source": "train/acc.tsv::train/acc"
          },
...
```
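
To illustrate how a consumer might use this output with little custom manipulation, here is a rough sketch that fills the `<DVC_METRIC_DATA>` placeholder with the datapoints of every listed revision and then lists the distinct series. It assumes `content` is already a parsed dict, as in the example. `fill_template` and `series_keys` are hypothetical helpers, not part of DVC or dvc-render.

```python
import copy

PLACEHOLDER = "<DVC_METRIC_DATA>"


def fill_template(plot):
    """Return a copy of the Vega-Lite spec with the placeholder replaced
    by the datapoints of all revisions, in the order of `revisions`."""
    values = [
        point
        for rev in plot["revisions"]
        for point in plot["datapoints"].get(rev, [])  # a revision may have no data
    ]
    spec = copy.deepcopy(plot["content"])  # leave the original entry untouched
    if spec["data"]["values"] == PLACEHOLDER:
        spec["data"]["values"] = values
    return spec


def series_keys(spec):
    """Distinct (dvc_rev, dvc_source) pairs, i.e. the groups that the
    loess transform's `groupby` would smooth separately."""
    return sorted({(p["dvc_rev"], p["dvc_source"]) for p in spec["data"]["values"]})


# Trimmed-down entry modelled on the example output above.
plot = {
    "revisions": ["HEAD", "workspace"],
    "content": {"data": {"values": PLACEHOLDER}},
    "datapoints": {
        "workspace": [
            {
                "step": "0",
                "dvc_inferred_y_value": "0.5262",
                "dvc_rev": "workspace",
                "dvc_source": "train/acc.tsv::train/acc",
            }
        ]
    },
}

spec = fill_template(plot)
series = series_keys(spec)
```

Because the spec already encodes `color` by `dvc_rev` and `strokeDash` by `dvc_source`, a renderer following this sketch would only need to substitute the data.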

Originally posted by @dberenbaum in https://github.com/iterative/vscode-dvc/issues/3130#issuecomment-1407395008

dberenbaum commented 1 year ago

Context: this should let DVC behave consistently across its own HTML output, VS Code, and Studio, without VS Code and Studio having to manipulate the plots themselves as much. The goal is to reduce the custom manipulation each product needs to render plots.

mattseddon commented 1 year ago

Related to #9018

dberenbaum commented 10 months ago

Fixed by @mattseddon