Closed: HariSekhon closed this issue 5 years ago
Got burned by this yesterday. I'll never get those hours back again :)
@bysse at least the answer and fix were just a search away... :)
Since many/most people using tcollector are probably using the latest official tcollector release (1.3.2), which predates this patch, it wouldn't be surprising if most tcollector users are now experiencing this problem. You may consider getting 1.3.3 out the door ASAP, or doing a quick point release with just this change.
Oddly, this was harder for me to track down than it otherwise would have been, because I somehow did get one data point every hour or two, so my graphs weren't completely missing data -- the data was just very, very coarse (temporally speaking).
Yes, I should have included a search keyword like "Timestamp is too far out in the future" when posting my initial comment to help others find this issue. Not sure if we had any sporadic data points, but we were also getting disk alerts on the /var partition and uncommitted journal alerts caused by syslog input flooding.
Yes, you are right, I will try and get that done.
Adding the OpenTSDB list as well.
On Mon, Sep 14, 2020, 1:34 PM JG wrote the comment quoted above (https://github.com/OpenTSDB/tcollector/issues/405#issuecomment-692204398).
Yes, I got it. Our TCollector instances all went down on 2020-09-13 ....
This isn't that far away. The safety number in the following line of code should probably be increased to something a lot bigger before this breaks everyone's tcollector metrics:
Maybe something more like this:
This will give everyone a chance to come back from Christmas / New Year break and be back around their desks when it breaks, if anybody is still using TCollector that far in the future and AI hasn't replaced all of our jobs by then....
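For anyone trying to picture the failure mode, here is a rough sketch of how a hard-coded epoch-seconds ceiling behaves. It is not copied from the tcollector source: the names `OLD_CEILING`, `NEW_CEILING`, `ceiling_as_utc` and `is_plausible` are hypothetical, the old value of 1600000000 is an assumption (it converts to 2020-09-13 12:26:40 UTC, which would match the outage date reported above), and the 2040 replacement date is picked only for illustration.

```python
from datetime import datetime, timezone

# Assumed values for illustration only; not taken from the tcollector source.
OLD_CEILING = 1_600_000_000  # assumed old limit, 2020-09-13 12:26:40 UTC
# Hypothetical replacement: early January, after the New Year break, far in the future.
NEW_CEILING = int(datetime(2040, 1, 3, 14, 0, tzinfo=timezone.utc).timestamp())


def ceiling_as_utc(ceiling):
    """Render an epoch-seconds ceiling as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ceiling, tz=timezone.utc).isoformat()


def is_plausible(timestamp, ceiling):
    """Return True if an epoch-seconds timestamp does not exceed the ceiling."""
    return timestamp <= ceiling


if __name__ == "__main__":
    sample = OLD_CEILING + 1  # one second past the assumed old limit
    print("old ceiling:", ceiling_as_utc(OLD_CEILING))
    print("new ceiling:", ceiling_as_utc(NEW_CEILING))
    print("accepted under old ceiling:", is_plausible(sample, OLD_CEILING))
    print("accepted under new ceiling:", is_plausible(sample, NEW_CEILING))
```

Converting the constant to a readable date like this is a quick way to see when a fixed limit will start rejecting every current data point.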