Open giangbang opened 1 year ago
Hi @giangbang, thanks for using expt and for reporting the issue with full details on how to reproduce it.
I think your data fluctuates a lot, success rate being 0 or 1 with no averaging/smoothing applied in the raw data of this series:
Therefore, depending on the sampling frequency, we can observe such "aliasing artifacts" when the correlation between the data and the axis (global_step) is high. If the sampling frequency is unlucky, it picks up more "0" data points (hence the dip). As you correctly guessed, the main reason for this very noticeable difference is that the data are interpolated after re-sampling and only then smoothed via df.rolling.
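To illustrate the effect, here is a minimal sketch in plain NumPy/pandas. It does not use expt's actual code path: the helper `resample_then_smooth`, its parameters, and the synthetic periodic 0/1 series are all assumptions made for the demonstration. It only shows how a sampling grid that lines up with the period of the data can land on "0" points only, while a slightly different grid recovers roughly the true mean of 0.5.

```python
import numpy as np
import pandas as pd

# Hypothetical raw log: success is 0 or 1 and happens to be periodic in global_step.
steps = np.arange(0, 1001)
success = (steps % 10 >= 5).astype(float)  # five 0s then five 1s; true mean is about 0.5
raw = pd.Series(success, index=steps)


def resample_then_smooth(series, n_samples, window=5):
    """Interpolate onto an evenly spaced grid first, then apply a rolling mean.

    This mimics the order "interpolate, then df.rolling" discussed above;
    it is an illustration, not expt's implementation.
    """
    grid = np.linspace(series.index[0], series.index[-1], n_samples)
    values = np.interp(grid, series.index.to_numpy(), series.to_numpy())
    return pd.Series(values, index=grid).rolling(window, min_periods=1).mean()


# Grid spacing of exactly 10 steps: every sample lands on a "0" point.
aligned = resample_then_smooth(raw, n_samples=101)
# Grid spacing of about 10.4 steps: samples drift through the whole period.
misaligned = resample_then_smooth(raw, n_samples=97)

print(aligned.mean(), misaligned.mean())
```

The two curves come from the same raw data; only the number of samples differs, which is the situation in the reported plot.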
My suggestion is to either smooth your log data when writing the scalar data, which avoids the aliasing problem, or pre-process the data via expt (which can be a bit more flexible), so that the subsampled data become statistically stable and consistent:
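As a rough sketch of the "smooth first, then subsample" idea, the following uses only pandas and NumPy rather than any expt-specific call. The synthetic series, the window of 50, and the grid size of 101 are assumptions picked for illustration. Because the raw 0/1 data are averaged before the grid is applied, the choice of grid no longer decides whether "0" or "1" points get picked.

```python
import numpy as np
import pandas as pd

# Hypothetical raw log: periodic 0/1 success indicator indexed by global_step.
steps = np.arange(0, 1001)
raw = pd.Series((steps % 10 >= 5).astype(float), index=steps)

# Step 1: smooth the raw 0/1 values with a rolling mean over the original rows.
smoothed = raw.rolling(window=50, min_periods=1).mean()

# Step 2: only now subsample onto a coarse, evenly spaced grid.
grid = np.linspace(0, 1000, 101)  # spacing of 10 steps, aligned with the period
resampled = pd.Series(
    np.interp(grid, smoothed.index.to_numpy(), smoothed.to_numpy()),
    index=grid,
)

# After the warm-up region, the subsampled curve sits at the true success rate.
steady = resampled[resampled.index >= 50]
print(steady.min(), steady.max())
```

Even with the grid aligned to the period of the data, the subsampled values stay at the underlying success rate once the rolling window is full.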
One downside of this is that, because we smooth the data before subsampling, rolling-window smoothing ends up being applied twice. Subsampling serves other purposes too: it makes things faster (fewer rows in the dataframe) and "smooths" the curve for a better look. In your particular case you may want the data subsampled but not interpolated. To that end, I am also going to add a more flexible API that allows subsampling the data without linear interpolation.
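For reference, subsampling without interpolation can be approximated today in plain pandas by keeping a subset of the existing rows. The helper `subsample_rows` below is hypothetical and is not the API mentioned above; it simply picks rows at evenly spaced positions, so every retained value is one that was actually logged.

```python
import numpy as np
import pandas as pd


def subsample_rows(series: pd.Series, n_samples: int) -> pd.Series:
    """Keep at most n_samples existing rows, evenly spaced by position.

    No values are interpolated: each returned row is an original row.
    """
    if n_samples >= len(series):
        return series
    positions = np.linspace(0, len(series) - 1, n_samples).round().astype(int)
    return series.iloc[np.unique(positions)]


# Hypothetical log written every 3 steps, with raw 0/1 values.
steps = np.arange(0, 1000, 3)
raw = pd.Series((steps % 2).astype(float), index=steps)

picked = subsample_rows(raw, n_samples=50)
print(len(raw), len(picked))
```

On its own this does not remove aliasing in raw 0/1 data; it is meant to be combined with smoothing applied beforehand, so that smoothing happens once and no linear interpolation is introduced.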
Hello @wookayin, I'm using expt in one of my projects and I recently noticed what is probably a bug when plotting the learning results of an RL agent. Here is the image of the plot; both panels use the same data, and the only difference is the value of n_sample. There is a drop in performance on the left, but merely changing the sampling frequency eliminates it. I suspect the problem is that interpolation is carried out before the rolling step, and weird things can happen when the sampling frequency is related to the frequency of the data, which can be the case in some rare situations. The code for reproducing the plot can be found here.
(I'm using the latest commit of expt.)