Open maxfliri opened 1 year ago
hi @maxfliri could you collect the full logs for a "bad case" from an agent running with the env var RUST_LOG=debug?
hi @dkhokhlov
I tried to run the agent with RUST_LOG=debug in our testing environment, but unfortunately it keeps crashing and restarting when I do so. I think the agent is somehow crashing under the load of its own logs. I tried to exclude the agent's own logs with LOGDNA_EXCLUSION_RULES="/var/log/containers/logdna-agent*.log", but the agent keeps crashing anyway.
fix PR: https://github.com/logdna/logdna-agent-v2/pull/496
there is also another fix in development now for the same issue. cc: @jakedipity
To clarify, that PR fixes a bug we've identified where the agent re-ingests logs every time its internal tailing system is rebuilt (by default every 6 hours). This bug has been around since before 3.6, but was exacerbated by the regular restarts introduced in 3.7.
It's very likely that this bug is causing the issue you're seeing, but there's not enough information to say for certain.
@maxfliri The agent crashing is interesting. If possible, could you get us the log output? I don't see the same behavior when enabling debug logs.
I'm seeing a lot of duplicated lines since upgrading to v3.8.0. As an example, this line was reported multiple times over a period of about a week:
If I print the container logs using kubectl, I see the same line only once:
I'm running logdna-agent v3.8.0 in Kubernetes. If I revert to the version I was using before (v2.2.4), the problem goes away.
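
For anyone trying to quantify the duplication described above, here is a minimal sketch in Python. It assumes the ingested lines have already been exported to a plain text file, one line per entry; the function names and the file name in the usage comment are hypothetical and not part of the agent or any LogDNA tooling. It counts exact-text repeats only, so a message the application legitimately logs more than once will also show up unless the exported lines include timestamps.

```python
from collections import Counter


def find_duplicates(lines):
    """Return {line: count} for every non-empty line seen more than once."""
    counts = Counter(line.rstrip("\n") for line in lines if line.strip())
    return {line: n for line, n in counts.items() if n > 1}


def duplicates_in_file(path):
    """Read an exported log file (hypothetical path) and report repeated lines."""
    with open(path, encoding="utf-8") as fh:
        return find_duplicates(fh)


# Hypothetical usage, assuming an export named exported_lines.txt:
#   for line, n in duplicates_in_file("exported_lines.txt").items():
#       print(f"{n}x {line}")
```

Running the same check against the output of kubectl logs for the container gives a baseline: a line that is unique there but repeated in the export points at re-ingestion rather than at the application.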