timescale / tsbs

Time Series Benchmark Suite, a tool for comparing and evaluating databases for time series data
MIT License

ClampedRandomWalkDistribution with NormalDistribution with mean > 0 will semi-permanently generate maximum value #150

Open fsolleza opened 3 years ago

fsolleza commented 3 years ago

Relevant files: pkg/data/usecases/common/distribution.go

Problem: If you instantiate a ClampedRandomWalkDistribution with common.CWD(0, 1000, common.ND(50, 1)) and call Advance() enough times, the state eventually reaches 1000 and stays there almost permanently. I don't know whether this was the intent of ClampedRandomWalkDistribution, but it doesn't seem to be what was expected. The expectation was a random walk that stays around a mean of 50 and never goes below 0 or above 1000.

Diagnosis: Each step drawn from the NormalDistribution has a mean of 50. This almost always positive value is added to the State of the ClampedRandomWalkDistribution on each Advance(). Eventually State > 1000, which is clamped to State = 1000. Because the step has a mean of 50 and a standard deviation of 1, a negative step is extremely unlikely, so State almost never drops below 1000 again.
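
The saturation described in the diagnosis can be reproduced with a small standalone sketch. The clampedWalk type and countAtMax helper below are hypothetical stand-ins written for this illustration, not the tsbs code; they only assume that each Advance() adds a normally distributed step to State and then clamps it.

```go
package main

import (
	"fmt"
	"math/rand"
)

// clampedWalk is a hypothetical stand-in for ClampedRandomWalkDistribution:
// each Advance adds a normally distributed step to State and clamps the result.
type clampedWalk struct {
	Mean, StdDev float64 // parameters of the step distribution
	Min, Max     float64 // clamp bounds
	State        float64
}

// Advance draws one step and clamps State to [Min, Max].
func (w *clampedWalk) Advance(rng *rand.Rand) {
	w.State += rng.NormFloat64()*w.StdDev + w.Mean
	if w.State > w.Max {
		w.State = w.Max
	}
	if w.State < w.Min {
		w.State = w.Min
	}
}

// countAtMax advances the walk n times and reports how often State equals Max.
func countAtMax(w *clampedWalk, n int, seed int64) int {
	rng := rand.New(rand.NewSource(seed))
	hits := 0
	for i := 0; i < n; i++ {
		w.Advance(rng)
		if w.State == w.Max {
			hits++
		}
	}
	return hits
}

func main() {
	drifting := &clampedWalk{Mean: 50, StdDev: 1, Min: 0, Max: 1000}
	fmt.Println("steps at Max with mean 50:", countAtMax(drifting, 1000, 1))
}
```

With a step mean of 50, the walk needs only about 20 steps to reach 1000, so nearly all of the 1000 steps in this run end exactly at Max.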

Potential solution: ClampedRandomWalkDistribution should always be initialized with common.ND(0, stdev), where stdev is the required standard deviation. ClampedRandomWalkDistribution should also have an Offset attribute. If the random walk should be centered on some mean value, then Get() should return Offset + State. The Max and Min cutoffs would also need to be adjusted to account for the offset.
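
A minimal sketch of the Offset idea, assuming a zero-mean step and bounds expressed on the reported value. The offsetWalk type and its fields are hypothetical names for illustration only and do not exist in tsbs.

```go
package main

import (
	"fmt"
	"math/rand"
)

// offsetWalk is a hypothetical variant of the clamped walk: steps have zero
// mean, State holds the deviation from Offset, and Get reports Offset + State.
type offsetWalk struct {
	StdDev   float64 // standard deviation of each zero-mean step
	Min, Max float64 // bounds on the value returned by Get
	Offset   float64 // value the walk starts from
	State    float64 // deviation from Offset
}

// Advance takes one zero-mean step and clamps State so that
// Offset + State stays within [Min, Max].
func (w *offsetWalk) Advance(rng *rand.Rand) {
	w.State += rng.NormFloat64() * w.StdDev
	if hi := w.Max - w.Offset; w.State > hi {
		w.State = hi
	}
	if lo := w.Min - w.Offset; w.State < lo {
		w.State = lo
	}
}

// Get returns the current value of the walk.
func (w *offsetWalk) Get() float64 {
	return w.Offset + w.State
}

// observedRange advances the walk n times and returns the smallest and
// largest values seen.
func observedRange(w *offsetWalk, n int, seed int64) (float64, float64) {
	rng := rand.New(rand.NewSource(seed))
	lo, hi := w.Get(), w.Get()
	for i := 0; i < n; i++ {
		w.Advance(rng)
		if v := w.Get(); v < lo {
			lo = v
		} else if v > hi {
			hi = v
		}
	}
	return lo, hi
}

func main() {
	w := &offsetWalk{StdDev: 1, Min: 0, Max: 1000, Offset: 50}
	lo, hi := observedRange(w, 10000, 1)
	fmt.Printf("observed range: [%.2f, %.2f]\n", lo, hi)
}
```

Note that a zero-mean walk has no systematic drift but is not mean-reverting either: it starts at Offset, and its spread grows with the number of steps, so it can still touch either bound over a long run.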

Other notes: A similar issue exists with RandomWalk. It isn't strictly incorrect, though the behavior is probably not what was intended. If RandomWalk is initialized with common.ND(50, 1), it takes steps with a mean of +50. As a result, RandomWalk is almost always monotonically increasing (as with any step mean >> 0), overflow issues notwithstanding.

fsolleza commented 3 years ago

A simpler alternative: wherever ClampedRandomWalkDistribution is initialized with common.CWD(0, 1000, common.ND(50, 1), 0), it should instead be initialized with common.CWD(0, 1000, common.ND(0, 1), 50). See, for example, pkg/data/usecases/common/usecases/devops/redis.go
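
To compare the two initializations side by side, here is a rough sketch. The stepsAtMax function is a hypothetical helper, not part of tsbs; it assumes the last argument in the calls above is the starting state and that the walk simply clamps after each normally distributed step.

```go
package main

import (
	"fmt"
	"math/rand"
)

// clamp restricts v to [lo, hi].
func clamp(v, lo, hi float64) float64 {
	if v > hi {
		return hi
	}
	if v < lo {
		return lo
	}
	return v
}

// stepsAtMax simulates a clamped random walk with normally distributed steps
// and returns how many of the n steps ended exactly at hi.
// It is a hypothetical stand-in for the walk discussed here, not the tsbs code.
func stepsAtMax(mean, stddev, lo, hi, start float64, n int, seed int64) int {
	rng := rand.New(rand.NewSource(seed))
	state, hits := start, 0
	for i := 0; i < n; i++ {
		state = clamp(state+rng.NormFloat64()*stddev+mean, lo, hi)
		if state == hi {
			hits++
		}
	}
	return hits
}

func main() {
	// Step mean 50, starting at 0: saturates at the upper bound.
	fmt.Println("ND(50, 1), start 0: ", stepsAtMax(50, 1, 0, 1000, 0, 1000, 1))
	// Step mean 0, starting at 50: no drift toward the upper bound.
	fmt.Println("ND(0, 1), start 50:", stepsAtMax(0, 1, 0, 1000, 50, 1000, 1))
}
```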