Open ChenitoAf opened 1 month ago
This might be due to an issue in Telegraf. I know it sometimes stops reading data from the OPC server all of a sudden. I have raised a similar issue with them. Maybe it requires some extra module handling inside the connector.
I did some more tests.
As shown, the longest I have seen so far was 2.9s for Telegraf to finish the writing process, and from time to time a record like "did not complete within its flush interval" shows up.
So, I slightly reduced the metric_batch_size and increased flush_jitter a little bit.
The result:
but somehow, records for the input like "Previous collection has not completed; scheduled collection skipped" appeared more often. So there were more failures in collecting the data.
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf V1.33 on Windows
Docker
No response
Steps to reproduce
Expected behavior
Data comes in every 1s.
Actual behavior
Most of the time, the data comes in every 1s. But randomly, it is missing 1 data point in a random measurement.
Sometimes it is 1 data point in measurement 1, sometimes in measurement 2, very random. Even within the same measurement, sometimes only part of the values are missing this 1 data point, not all of them. (As shown in the stacked chart below.)
From time to time, you can see in the Telegraf log file that a write did not complete within the flush interval.
Additional info
I have tried to increase metric_batch_size and metric_buffer_limit, and raised the flush rate with flush_interval and flush_jitter. But so far, this issue keeps popping up. As a rough idea, it's missing 2-3 data points every minute. In other applications where I am collecting 1000-2000 data points with a 1s interval, there's no such issue of a missing data point. It only appeared when I increased the amount of data.
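For reference, the settings named above all live in the [agent] section of telegraf.conf. Below is a minimal sketch of that section, only to show where each option goes. The values are placeholders chosen for illustration, not the exact configuration used in these tests.

```toml
# Sketch of the [agent] section; all values are illustrative placeholders.
[agent]
  # Collection interval for all inputs (the report expects data every 1s).
  interval = "1s"

  # Maximum number of metrics sent to an output in one write.
  metric_batch_size = 5000

  # Maximum number of unwritten metrics buffered per output;
  # the oldest metrics are dropped once this limit is exceeded.
  metric_buffer_limit = 50000

  # How often outputs are flushed.
  flush_interval = "1s"

  # Random delay added to each flush so that outputs do not all write at once.
  flush_jitter = "0s"
```

Each write has to finish before the next flush is due, so a write that takes longer than flush_interval (such as the 2.9s case mentioned earlier) is the kind that gets reported as not completing within its flush interval.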
Could you please help take a look?
I am happy to test any ideas.
Many thanks.