Greater control in downsampling used in plots

allegroai / clearml

Greater control in downsampling used in plots #1266

Open tensorpro opened 1 month ago

tensorpro commented 1 month ago

Proposal Summary

Allow more granular control over how data is downsampled when it is plotted.

Motivation

Currently, when we compare a run in progress to a completed run, smoothing is applied inconsistently. Because both the in-progress run and the completed run are downsampled to ~5000 steps, the completed run, which covers more steps, is sampled more coarsely and appears much smoother than the run in progress. This prevents the plots from being useful for comparing partially completed runs.
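To make the mismatch concrete, here is a minimal sketch. It assumes a simple keep-every-k-th-point reduction and made-up run lengths; `downsample` is a hypothetical helper, not ClearML's actual implementation, which may reduce the data differently.

```python
import math

def downsample(steps, max_points=5000):
    """Hypothetical reduction: keep every k-th sample so at most max_points remain."""
    stride = max(1, math.ceil(len(steps) / max_points))
    return steps[::stride], stride

# Assumed run lengths, chosen only for illustration.
completed = list(range(100_000))    # finished run
in_progress = list(range(10_000))   # run that is 10% of the way through

completed_kept, completed_stride = downsample(completed)
in_progress_kept, in_progress_stride = downsample(in_progress)

# Both runs end up with the same number of plotted points,
# but the completed run keeps far fewer points per training step.
print(len(completed_kept), completed_stride)
print(len(in_progress_kept), in_progress_stride)
```

Under these assumptions, the two curves are drawn at very different resolutions over the steps they share, which is the inconsistency described above.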

Related Discussion

If this continues a Slack thread, please provide a link to the original Slack thread.

ainoam commented 1 month ago

Thanks for the suggestion, @tensorpro.

Sounds like this becomes an issue when a run of interest contains significantly fewer data points than the others. If one were to naively adjust the sampling to match the run with the smallest number of points, this could incur unnecessary load, as significantly more data points would be retrieved from the other runs.

Perhaps a better approach for this use case could be to equalize the x-axis range so that only the points being compared would be retrieved (and as a result, the same sampling would be applied)?
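A rough sketch of that idea, under stated assumptions: each run is a non-empty list of `(step, value)` pairs sorted by step, and the sampling is a plain stride computed from a point budget. `clip_to_common_range` and `stride_for` are hypothetical names used for illustration, not ClearML APIs.

```python
import math

def clip_to_common_range(runs):
    """Trim each run to the step range shared by all runs.

    Assumes every run is a non-empty list of (step, value) pairs sorted by step.
    """
    lo = max(run[0][0] for run in runs)    # latest starting step
    hi = min(run[-1][0] for run in runs)   # earliest ending step
    return [[(s, v) for s, v in run if lo <= s <= hi] for run in runs]

def stride_for(n_points, max_points=5000):
    """Hypothetical stride so that at most max_points are kept."""
    return max(1, math.ceil(n_points / max_points))

# Assumed run lengths, for illustration only.
completed = [(s, 0.0) for s in range(100_000)]
in_progress = [(s, 0.0) for s in range(10_000)]

clipped = clip_to_common_range([completed, in_progress])
strides = [stride_for(len(run)) for run in clipped]
print(strides)
```

Once both runs are limited to the overlapping steps, they contain the same number of points here, so the same stride applies to each and no extra data is fetched from the longer run.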

tensorpro commented 1 month ago

That sounds like a good approach 😃

I might be able to help add this functionality with a PR, if the feature seems likely to be merged.