pinpoint-apm / pinpoint

APM (Application Performance Management) tool for large-scale distributed systems.
https://pinpoint-apm.gitbook.io/
Apache License 2.0
13.39k stars, 3.75k forks

Agent logs are being deleted (and not recreated): Error writing to stream / Stale file handle errors #8279

Open amilhub opened 3 years ago

amilhub commented 3 years ago

What version of pinpoint are you using?

master/v2.3.3 (docker)

Describe your problem

The agent logs are being deleted, and neither the log folder nor pinpoint.log and pinpoint_stat.log are recreated. As a result, the Tomcat application keeps reporting errors because the agent log files can no longer be found.

Logs

Tomcat -> catalina.out (continuous error messages)

2021-09-28 17:05:34,723 Pinpoint-DefaultChannelzScheduledReporter(0-0) ERROR Recovering from StringBuilderEncoder.encode('09-28 17:05:34.034 [edReporter(0-0)] INFO c.n.p.m.SpanChannel -- 3-Socket-6 {keepAlivesSent=20400, lastMessageSentTime=1632833792682, lastMessageReceivedTime=0, localFlowControlWindow=1048576, remoteFlowControlWindow=843398} ') error: org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to stream /mount_point/pinpoint-agent-2.3.0/logs/application_name//pinpoint_stat.log
org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to stream /mount_point/pinpoint-agent-2.3.0/logs/application_name//pinpoint_stat.log
...
Caused by: java.io.IOException: Stale file handle
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:261)
... 43 more

Additional context

If I reboot the Tomcat node, the folder and .log files are created correctly again. It seems that at about 00:00 the files are deleted and are not recreated. Note: as a provisional workaround, disabling all agent logs would be acceptable (with no logs to write, the issue would not reproduce).

emeroad commented 3 years ago
Caused by: java.io.IOException: Stale file handle
at java.io.FileOutputStream.writeBytes(Native Method) <- OS level API
at java.io.FileOutputStream.write(FileOutputStream.java:326)

java.io.IOException: Stale file handle is thrown by the OS.

In my opinion, you should check the system, OS, and disk first, especially if you are using remote storage such as NFS.

https://www.baeldung.com/linux/stale-file-handles The most common scenarios where stale file handles are not refreshed are NFS or CIFS mounted shared directories.
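To make the mechanism concrete, here is a minimal Java sketch. It is not Pinpoint or Log4j2 code; the class name `DeletedLogDemo` and the temporary directory are made up for illustration, and it assumes a local Linux filesystem. A writer keeps a file open while something else deletes it. Locally, the write after the deletion succeeds but the file is never recreated at its path. On an NFS mount, where the file was removed on the server or by another client, the same write can instead fail with `Stale file handle`, as in the trace above.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeletedLogDemo {

    /**
     * Writes through a handle that was opened before the file was deleted.
     * Returns true if the file exists at its path again after that write.
     */
    static boolean recreatedAfterDelete() throws IOException {
        // Hypothetical stand-in for the agent's log directory.
        Path dir = Files.createTempDirectory("agent-logs");
        Path log = dir.resolve("pinpoint_stat.log");
        try (FileOutputStream out = new FileOutputStream(log.toFile(), true)) {
            out.write("before delete\n".getBytes(StandardCharsets.UTF_8));

            // Simulate an external cleanup removing the file while it is still open.
            Files.delete(log);

            // Local Linux filesystem: this write goes to the unlinked file and succeeds.
            // NFS (file removed elsewhere): this is where "Stale file handle" can surface.
            out.write("after delete\n".getBytes(StandardCharsets.UTF_8));

            // The open handle never recreates the directory entry.
            return Files.exists(log);
        } finally {
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("recreated: " + recreatedAfterDelete());
    }
}
```

This is consistent with the observation that a restart fixes things: only reopening the file by path creates it again.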

amilhub commented 3 years ago

Yes, I am using NFS for logs. I only see the "Stale file handle" error in the Tomcat logs. At the OS level there is no error or message related to NFS. The log folder and files are simply deleted, and the Tomcat node then runs into the error.

By the way, my NFS options were originally: timeo=40,vers=3

Now: defaults,hard,intr,vers=3

In any case, if I restart Tomcat, the log folder and files are cleared and recreated correctly.

Because this error is logged continuously, it is difficult to follow the application log. So, while I keep experimenting with NFS options:

  1. Is there a way to completely disable the agent logs? Maybe /profiles/release/log4j2.xml is a good starting point, but I'm not sure how to do it.
  2. Another valid solution for me would be to disable the "maintenance" job that deletes the log folder and its files (I could then, if required, set up a manual cron job that simply truncates the log files). Is there a way to disable this agent deletion job?
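
For the manual truncation idea in point 2, a rough Java sketch of what such a job could do is shown here. The class name `TruncateLogs` and the `*.log` pattern are assumptions for illustration; this is not part of Pinpoint and it does not disable any deletion job. It truncates each file in place instead of deleting it, so the path stays valid for a process that already has the file open. It also assumes the writer opened the file in append mode; otherwise the writer would keep writing at its old offset.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class TruncateLogs {

    /** Truncates every *.log file directly under dir to zero bytes; returns how many were truncated. */
    static int truncateLogs(Path dir) throws IOException {
        int count = 0;
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(dir, "*.log")) {
            for (Path log : logs) {
                // Open for writing without CREATE or TRUNCATE_EXISTING, then cut the length to 0.
                try (FileChannel channel = FileChannel.open(log, StandardOpenOption.WRITE)) {
                    channel.truncate(0);
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java TruncateLogs <log-directory>
        System.out.println("truncated files: " + truncateLogs(Paths.get(args[0])));
    }
}
```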
amilhub commented 3 years ago

No luck with the new mount options for NFS :(