Proposal Summary
I noticed that when a scalar has a large number of logged samples, the resulting plot is subsampled. The small plot and the enlarged plot are also subsampled differently. I understand that subsampling might be a good idea for performance reasons. However, it would be great if the degree of subsampling were adjustable, or if subsampling could be turned off when necessary.
Motivation
The differences in subsampling between view modes, and the lack of control over it, lead to confusion and make evaluation harder. Sometimes the small plot suggests that a scalar stayed within a certain range (e.g. never lower than 80 %), but the enlarged plot reveals that the value was in fact much lower for some samples (e.g. 0 %). In comparison mode there is also a difference between an old experiment that already has all its samples logged and a still-running experiment where the number of samples is still small. This makes comparing experiments confusing: the old one looks more stable than the one still running, and so on.
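A minimal sketch of the effect, assuming plain every-k-th-sample downsampling. This is an assumption for illustration only, not ClearML's actual implementation; `stride_subsample` and the data are made up.

```python
def stride_subsample(values, max_points):
    """Keep every k-th sample so that at most max_points remain.

    Hypothetical helper: stands in for whatever downsampling the UI applies.
    """
    step = max(1, -(-len(values) // max_points))  # ceiling division
    return values[::step]


# Made-up accuracy series: 90 % everywhere except a single drop to 0 %.
accuracy = [90.0] * 1000
accuracy[501] = 0.0

small_plot = stride_subsample(accuracy, 100)   # heavily subsampled view
full_plot = stride_subsample(accuracy, 1000)   # no subsampling

print(min(small_plot))  # 90.0, the drop is skipped
print(min(full_plot))   # 0.0, the drop is visible
```

With a configurable `max_points` (or an option to disable subsampling), both views would show the same minimum.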