Open SoyGema opened 1 year ago
It's more related to dvc-render to be honest, but let's keep it here for now (cc @daavoo @dberenbaum). Some customers (Studio) were also mentioning broken and overly aggressive smoothing, which forced them to stay with TB. We need to go back and research the templates again, I guess. Making p1, assigning to myself, David, and Dave. We'll try to get to it.
"It's more related to dvc-render to be honest, but ..." --> Hey, thanks for this. From now on I'll make the effort to explore codebases to report bugs more efficiently! "We'll try to get to it." --> You've got it!
My $0.02 is that, even on a laptop, the experience isn't that bad when the plot is given the screen's entire space:
On a wide screen you get an even better sense of what is going on:
I think the underlying problems here are:
We are incorrectly using the Vega transform for smoothing linear charts. The transform we use is meant for scatter plots: https://vega.github.io/vega/examples/loess-regression/ People are used to a behavior based on something like an exponential moving average (i.e. the one used in TensorBoard).
Related to the above, displaying the original unconnected points makes sense for the regression case but not for people expecting the "TensorBoard behavior".
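For reference, here is a minimal sketch of the kind of exponential moving average people tend to expect. It assumes a debiased EMA similar in spirit to TensorBoard's smoothing slider; the function name `ema_smooth` and the `weight` parameter are illustrative and not taken from dvc-render or TensorBoard.

```python
from typing import List


def ema_smooth(values: List[float], weight: float) -> List[float]:
    """Debiased exponential moving average (illustrative sketch).

    weight = 0 returns the raw values; weights close to 1 smooth heavily.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values, start=1):
        # Standard EMA update, starting from zero.
        running = weight * running + (1.0 - weight) * value
        # Correct the bias towards zero caused by the zero initialisation.
        debias = 1.0 - weight ** i
        smoothed.append(running / debias)
    return smoothed
```

Unlike a loess fit, each smoothed point here depends only on the current and earlier points, so the curve follows the trend from the left without fitting a regression across the whole plot.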
Generally, the smoothing seems more aggressive than I'd expect, especially at the edges of plots. If there's interpolation with linear, where a smoothness of 1 is linear, then I can definitely see why that is. I would not expect linear. I guess I'm someone expecting something more like the "TensorBoard behavior". @daavoo, I think that first bullet is exactly where I'm coming from.
For an example of what I'd expect, I'm attaching examples of what TensorBoard does:
And this is studio:
I'm looking into this (trying different Vega hacks). For the record, this is the Vega dump that can be used in the Vega editor:
I think we need this https://github.com/vega/vega/pull/3686 (+ change the way we show lines a bit) + use a window transform on top of the exponential avg: https://stackoverflow.com/questions/55996589/how-to-layer-a-moving-average-on-line-chart-with-vega-lite
This seems solved. Therefore closing it.
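As a rough illustration of the window-transform idea, the sketch below builds a Vega-Lite spec that layers a trailing rolling mean over the raw line. The field names `step` and `value`, the dataset name `values`, and the helper `rolling_mean_spec` are assumptions for the example, not what dvc-render actually emits.

```python
import json


def rolling_mean_spec(window: int, x_field: str = "step", y_field: str = "value") -> dict:
    """Build a Vega-Lite spec with a trailing rolling mean (illustrative sketch)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    encoding_x = {"field": x_field, "type": "quantitative"}
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"name": "values"},
        "transform": [
            {
                # Mean over the current row and the (window - 1) rows before it.
                "window": [{"op": "mean", "field": y_field, "as": "rolling_mean"}],
                "frame": [-(window - 1), 0],
                "sort": [{"field": x_field}],
            }
        ],
        "layer": [
            # Raw values as a faint connected line.
            {
                "mark": {"type": "line", "opacity": 0.3},
                "encoding": {"x": encoding_x, "y": {"field": y_field, "type": "quantitative"}},
            },
            # Smoothed values on top.
            {
                "mark": "line",
                "encoding": {"x": encoding_x, "y": {"field": "rolling_mean", "type": "quantitative"}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(rolling_mean_spec(10), indent=2))
```

The printed JSON can be pasted into the Vega editor together with some inline data to try different window sizes.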
@SoyGema hey, we've improved it, but it's not solved yet :( we need to fix the way smooth actually works via https://github.com/vega/vega/pull/3686
Hey, my apologies. Saw a Merged that looked good. :) Keep it up! Please consider, in the future, a policy (the label is a great idea, thanks for adding it) for understanding bug reporting / issue impact / scope from the contributor's perspective. :) It seems that the ownership/responsibility of the issue goes to the Iterative team from the start, so I might not follow along with actions. Let me know if this hypothesis is correct.
Thanks! Have a nice day!
Opened https://github.com/iterative/dvc-render/issues/135 since I don't think there's much VS Code can do.
Since we are waiting on https://github.com/vega/vega/pull/3686 and it doesn't appear to be moving right now, is it worth considering other options?
The quickest fix is to move to a simple (non-exponential) moving average, which is similar to what we want except for having an unweighted, fixed window. Here's how it looks in comparison to the current smoothing (old smoothing shown by smooth; new moving average shown by rolling_window):
https://github.com/iterative/vscode-dvc/assets/2308172/27778517-e215-4025-a954-9d407eebd81b
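To make the "unweighted, fixed window" point concrete, a simple trailing moving average can be sketched as follows. This is an illustration only; the name `moving_average` is hypothetical and this is not the code behind the rolling_window line in the video.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing moving average with equal weights (illustrative sketch).

    The first (window - 1) outputs average over however many points exist so far.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    out = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            # Drop the point that just fell out of the window.
            total -= values[i - window]
        out.append(total / min(i + 1, window))
    return out
```

Every point inside the window counts equally, whereas an exponential average weights recent points more and never fully forgets older ones.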
We could also consider starting to move towards plotly (see https://github.com/iterative/dvc-render/issues/7). It already has a triangular moving average, which is probably close enough to an exponential moving average.
Also note that the TensorBoard example from @SoyGema looks off to me too. In the last row, with the max smoothing parameter, the smoothed line sits way below the actual trend of the points:
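For anyone unfamiliar with the term, a triangular moving average is a weighted average whose weights fall off linearly from the centre of the window. The sketch below is a generic, centred version written from that definition; it makes no claim about how plotly implements it, and `half_width` is a parameter name chosen for the example.

```python
from typing import List


def triangular_moving_average(values: List[float], half_width: int) -> List[float]:
    """Centred moving average with triangular weights (illustrative sketch)."""
    if half_width < 0:
        raise ValueError("half_width must be >= 0")
    n = len(values)
    out = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for offset in range(-half_width, half_width + 1):
            j = i + offset
            if 0 <= j < n:
                # Weight is largest at the centre and decreases linearly outwards.
                weight = half_width + 1 - abs(offset)
                weighted_sum += weight * values[j]
                weight_total += weight
        # Renormalise so truncated windows at the edges still average correctly.
        out.append(weighted_sum / weight_total)
    return out
```

Because the window is centred, this looks at future points as well as past ones, which is one way it differs from an exponential moving average.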
This still hasn't propagated to Vega Lite and is blocked.
I think it's on us to propagate it. It seemed more or less straightforward the last time I checked, and Vega-Lite is moving faster. Let's take a look please.
In that case, I don't think I'm the person for that job as I am quite lost in all this. I would not know if I'm doing the right thing or not. I'll unassign myself.
CONTEXT : Neural Machine Translation training scenario.
1.7.8
2.56.0
0.8.7
The smoothing feature shows two lines instead of one. The thicker green line should be transformed as the UI smoothing slider moves along the bar, but it is not.
https://user-images.githubusercontent.com/24204714/236618849-5b88dd69-e1ae-4f40-a9ee-2993ac6248c6.mp4
From conversation in Discord
Increasing plot size test seems to have the same issue
Machine Learning Epistemics coming from the conversation in Discord
The signal, or number of data points, can increase significantly when we change the batch size parameter during experiments: batch size controls the accuracy of the estimate of the error gradient when training neural networks. Batch sizes that are too large (fewer data points) may cause bad generalization. Smaller batch sizes (more data points) take more training time but might generalize better. Iterating over this parameter is a common experimentation scenario.
NOTE for machine learning practitioners: this parameter starts with 8 and increases by powers of 2. Experimenting with this parameter is key for better generalization by having other rational hypotheses in mind. This article gives good hints.
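To give a feel for how much the number of plotted points can change, here is a small sketch. It assumes one metric is logged per training step and uses a made-up dataset size of 100,000 samples; both are assumptions for illustration, not values from this experiment.

```python
import math


def points_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of training steps (and, by assumption, logged points) per epoch."""
    if num_samples < 0 or batch_size < 1:
        raise ValueError("num_samples must be >= 0 and batch_size >= 1")
    return math.ceil(num_samples / batch_size)


if __name__ == "__main__":
    # Hypothetical dataset size; batch sizes grow by powers of 2 starting at 8.
    for batch_size in (8, 16, 32, 64):
        print(batch_size, points_per_epoch(100_000, batch_size))
```

Under these assumptions, going from a batch size of 64 to 8 multiplies the points per epoch by about eight, which is the kind of jump that makes smoothing behavior matter.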
For developers
Nice work !