Open leifwalsh opened 10 years ago
That sounds like a bug. Can you tell me a little more about your data stream? What are you measuring and what is its distribution? Also, can you estimate how many Inserts until you see the situation you're describing?
If you could write a test that reproduces this problem, I would be very grateful. If not, I can do it.
I am measuring database operation latencies in milliseconds. Values start at about 10ms; most are roughly normally distributed around 50ms, but there is a very long tail reaching about 30000ms in some cases. The problem takes hours or days to show up, and (going from memory here) there are probably 10-50 samples per second. I'm using a stream initialized with the 50%, 90%, and 99% quantiles.
Cheers, Leif
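For anyone trying to write the reproducer, here is a rough sketch of a synthetic input with the shape described above. It uses only the Go standard library. The 10ms standard deviation, the 1% tail probability, and the exponential tail are guesses of mine, not measured values, and `syntheticLatency` and `sampleLatencies` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// syntheticLatency returns one latency in milliseconds: mostly normal around
// 50ms and floored at 10ms, with a rare long tail capped at 30000ms.
// The spread (10ms), tail probability (1%), and tail shape are assumptions.
func syntheticLatency(r *rand.Rand) float64 {
	if r.Float64() < 0.01 {
		// Long tail: exponential with a 3000ms mean, shifted past the bulk.
		return math.Min(30000, 100+r.ExpFloat64()*3000)
	}
	return math.Max(10, 50+r.NormFloat64()*10)
}

// sampleLatencies returns n synthetic latencies from a fixed seed, so a
// failing run can be replayed exactly.
func sampleLatencies(seed int64, n int) []float64 {
	r := rand.New(rand.NewSource(seed))
	out := make([]float64, n)
	for i := range out {
		out[i] = syntheticLatency(r)
	}
	return out
}

func main() {
	xs := sampleLatencies(1, 100000)
	lo, hi := math.Inf(1), math.Inf(-1)
	for _, v := range xs {
		lo = math.Min(lo, v)
		hi = math.Max(hi, v)
	}
	fmt.Printf("n=%d min=%.1fms max=%.1fms\n", len(xs), lo, hi)
}
```

At 10-50 samples per second, several hours of traffic is on the order of a few hundred thousand to a few million samples, so a reproducer would probably need to feed at least that many values through the stream.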
On Mon, Apr 28, 2014 at 11:44 PM, Blake Mizerany notifications@github.com wrote:
> If you could write a test that reproduces this problem, I would be very grateful. If not, I can do it.
Reply to this email directly or view it on GitHub: https://github.com/bmizerany/perks/issues/8#issuecomment-41639221
@leifwalsh I just re-read your issue. Will you please explain what you mean by "presumably because all the input in that time was above 50%."? It sounds like what you're saying is that 100% is above 50%, which makes no sense. I don't think this is what you mean, I just need some clarification.
You're right, that doesn't make any sense. I'm not sure what I meant by that.
The problem is that almost all reports are zeroes, and then every once in a while I get some large positive numbers and all three quantiles are the same. I was trying to guess at the root cause of that behavior and made a nonsensical guess. :)
Also, #10 looks like a plausible reproducer, thanks to @aybabtme for that!
I have a long-running application which has two streams (set to report 50%, 90%, and 99% quantiles) that both receive the same input data (latencies as float64 milliseconds from database operations). One stream gets reset each second after reporting a few quantiles, the other one reports at the same time but never gets reset. I've noticed a few times that after long enough, the stream that gets reset periodically starts reporting all 0s almost all the time, except when it looks like there were only one or two inputs in the window between resets, in which case all quantiles report the same number, presumably because all the input in that time was above 50%.
I'm not sure if this is a bug but I would like to start by getting advice on how to gather useful diagnostics from the data structure.
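One possible diagnostic, sketched below with only the Go standard library: keep the raw samples for each one-second window and compute exact quantiles from them, then log those next to whatever the stream reports. `exactQuantile` is a hypothetical helper, not part of the quantile library, and it uses the nearest-rank definition, which is an assumption on my part about which definition is most useful here.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// exactQuantile returns the q-quantile (0 <= q <= 1) of samples using the
// nearest-rank method on a sorted copy; the input slice is not modified.
// It returns NaN for an empty window, so an empty window cannot be mistaken
// for a genuine 0ms latency.
func exactQuantile(samples []float64, q float64) float64 {
	if len(samples) == 0 {
		return math.NaN()
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(q * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	windows := [][]float64{
		{},                   // no input in this window
		{120},                // a single sample
		{48, 51, 3000, 47, 55}, // a typical mix with one outlier
	}
	for i, w := range windows {
		fmt.Printf("window %d: n=%d p50=%v p90=%v p99=%v\n", i, len(w),
			exactQuantile(w, 0.50), exactQuantile(w, 0.90), exactQuantile(w, 0.99))
	}
}
```

Two things this would help separate. With a single sample in a window, all three exact quantiles equal that sample, so identical values in that case are expected rather than a symptom. And if the stream reports 0 for a window whose raw sample count is well above zero, that points at the stream's state after a reset rather than at the data.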