apache / trafficserver

Apache Traffic Server™ is a fast, scalable and extensible HTTP/1.1 and HTTP/2 compliant caching proxy server.
https://trafficserver.apache.org/
Apache License 2.0
1.82k stars 804 forks source link

Logs to stdout or stderr stops after a few minutes #8955

Open Tristan971 opened 2 years ago

Tristan971 commented 2 years ago

While trying to run ATS in a Docker container, I thought I'd make use of #7937 as a nice way to handle logs.

So I built branch 9.2.x, specifically 15bea4dd946c8cb6fc2000a4b31cf4f2f261b29d and set the following:

In records.config:

CONFIG proxy.node.config.manager_log_filename STRING stderr
CONFIG proxy.config.diags.logfile.filename STRING stderr
CONFIG proxy.config.error.logfile.filename STRING stderr

In logging.yaml:

logging:
...
  logs:
    - mode: ascii
      format: common
      filename: stderr

Then I use traffic_manager without arguments as (non-root) PID 1 in the container.

At first things look alright, but after approximately 3 minutes, the logs just stop being output. Moving back error and diags to a file I can see:

[Jul 12 02:22:27.699] [LOG_FLUSH] ERROR: Failed to write log to stderr: [tried 101, wrote 0, Bad file descriptor]
[Jul 12 02:23:30.737] [LOG_FLUSH] ERROR: The following message was suppressed 12 times.
[Jul 12 02:23:30.737] [LOG_FLUSH] ERROR: Failed to write log to stderr: [tried 101, wrote 0, Bad file descriptor]

Looking into the container, I also notice the following:

$ echo "test" >> /proc/1/fd/2
$ echo "test" >> /proc/$(cat /run/trafficserver/server.lock)/fd/1
$ echo "test" >> /proc/$(cat /run/trafficserver/server.lock)/fd/2
bash: /proc/11/fd/2: No such file or directory

As one would expect, the first two commands cause "test" to be printed on the container logs, and the third does nothing.

It seems like stderr gets disconnected entirely at some point? A traffic_ctl server restart does bring logs back for another 3 minutes, but then the issue happens again. (note that the other ones work just fine)

At first I thought it would be some issue with ATS trying to "roll" the pipe or something akin to that, but adding CONFIG proxy.config.log.rolling_enabled INT 0 to records.config doesn't make a difference, so I guess it's not that.

If it is relevant, this was compiled and ran on a Debian Bullseye image, and the image in question is registry.gitlab.com/mangadex-pub/trafficserver:9.2.x-bullseye-dbcecd20.

Also, ATS otherwise keeps running just fine while that happens.

I doubt this will be particularly relevant, but here's the logs until they stop (when all set to stderr):

Logs ``` 2022-07-12T03:07:42 [E. Mgmt] log ==> [TrafficManager] using root directory '/usr' 2022-07-12T03:07:42 [Jul 12 02:07:42.435] traffic_manager STATUS: opened stderr 2022-07-12T03:07:42 [Jul 12 02:07:42.435] traffic_manager NOTE: updated diags config 2022-07-12T03:07:42 [Jul 12 02:07:42.437] traffic_manager NOTE: [LocalManager::listenForProxy] Listening on port: 8080 (ipv4) 2022-07-12T03:07:42 [Jul 12 02:07:42.437] traffic_manager NOTE: [TrafficManager] Setup complete 2022-07-12T03:07:43 [Jul 12 02:07:43.438] traffic_manager NOTE: [ProxyStateSet] Traffic Server Args: ' -M' 2022-07-12T03:07:43 [Jul 12 02:07:43.439] traffic_manager NOTE: [LocalManager::listenForProxy] Listening on port: 8080 (ipv4) 2022-07-12T03:07:43 [Jul 12 02:07:43.439] traffic_manager NOTE: [LocalManager::startProxy] Launching ts process 2022-07-12T03:07:43 [Jul 12 02:07:43.463] traffic_manager NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '9' 2022-07-12T03:07:43 [Jul 12 02:07:43.463] traffic_manager NOTE: [Alarms::signalAlarm] Server Process born 2022-07-12T03:07:45 [Jul 12 02:07:45.472] traffic_server STATUS: opened stderr 2022-07-12T03:07:45 [Jul 12 02:07:45.472] traffic_server NOTE: updated diags config 2022-07-12T03:07:45 [Jul 12 02:07:45.483] traffic_server NOTE: storage.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.483] traffic_server NOTE: storage.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.492] traffic_server NOTE: ip_allow.yaml loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.493] traffic_server NOTE: ip_allow.yaml finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.494] traffic_server NOTE: parent.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.494] traffic_server NOTE: parent.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.498] traffic_server NOTE: /etc/trafficserver/logging.yaml loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.500] traffic_server NOTE: /etc/trafficserver/logging.yaml finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.504] traffic_server NOTE: logging initialized[3], logging_mode = 3 2022-07-12T03:07:45 [Jul 12 02:07:45.504] traffic_server NOTE: Initialized plugin_dynamic_reload_mode: 1 2022-07-12T03:07:45 [Jul 12 02:07:45.504] traffic_server NOTE: plugin.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.505] traffic_server NOTE: loading plugin '/usr/lib/trafficserver/modules/header_rewrite.so' 2022-07-12T03:07:45 [Jul 12 02:07:45.507] traffic_server NOTE: loading plugin '/usr/lib/trafficserver/modules/healthchecks.so' 2022-07-12T03:07:45 [Jul 12 02:07:45.510] traffic_server NOTE: plugin.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.515] traffic_server NOTE: /etc/trafficserver/sni.yaml loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.516] traffic_server NOTE: /etc/trafficserver/sni.yaml finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.516] traffic_server NOTE: ssl_multicert.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.517] traffic_server NOTE: /etc/trafficserver/ssl_multicert.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.518] traffic_server NOTE: volume.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.518] traffic_server NOTE: volume.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.534] traffic_server NOTE: remap.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.534] traffic_server NOTE: strategies.yaml loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.534] traffic_server NOTE: No NextHop strategy configs were loaded. 2022-07-12T03:07:45 [Jul 12 02:07:45.535] traffic_server NOTE: strategies.yaml finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.537] traffic_server NOTE: remap.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.544] [TS_MAIN] NOTE: traffic server running 2022-07-12T03:07:45 [Jul 12 02:07:45.544] [TS_MAIN] NOTE: Traffic Server is running unprivileged, not switching to user 'nobody' 2022-07-12T03:07:45 [Jul 12 02:07:45.767] [ET_NET 3] NOTE: recovery clearing offsets of Vol /var/cache/trafficserver/cache.db 32768:13107196 : [153985024, 162373632] sync_serial 33 next 34 2022-07-12T03:07:45 [Jul 12 02:07:45.928] [ET_NET 2] NOTE: hosting.config loading ... 2022-07-12T03:07:45 [Jul 12 02:07:45.933] [ET_NET 2] NOTE: hosting.config finished loading 2022-07-12T03:07:45 [Jul 12 02:07:45.951] [ET_NET 2] NOTE: cache enabled 2022-07-12T03:07:45 [Jul 12 02:07:45.951] [ET_NET 2] NOTE: Traffic Server is fully initialized. 2022-07-12T03:07:52 - - redacted - TCP_MISS 1ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:07:57 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:02 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:07 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:12 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:17 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:22 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:27 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:32 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:37 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:42 - - redacted - TCP_MISS 1ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:47 - - redacted - TCP_MISS 1ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:51 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:08:56 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:01 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:06 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:11 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:16 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:21 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:27 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:30 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:35 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:40 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:45 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:50 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:09:55 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:10:00 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:10:05 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:10:10 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:10:15 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 2022-07-12T03:10:20 - - redacted - TCP_MISS 0ms - "GET http://redacted/healthz HTTP/1.0" 200 3 [here it just abruptly stops] ```

Hopefully this helps track down the issue

bneradt commented 2 years ago

That is interesting. Thank you for the detailed description.

We (Yahoo) have properties that run with stderr logging and we haven't had anyone report seeing the logging stall. We should be able to get this to work, therefore.

Since this happens so quickly, I suggest enabling the log debug tag and monitor it for a few minutes until logging stops. Then see whether those debug logs provide any clues about what is going wrong:

CONFIG proxy.config.diags.debug.enabled INT 3
CONFIG proxy.config.diags.debug.tags STRING log

Note: Use 3 for the debug.enabled log, it is much more efficient than 1. After you collect the information you need, you'll likely want to disable debug logging so it doesn't impact production traffic.

Tristan971 commented 2 years ago

Thanks for the prompt reply.

First, I set debug.enabled to 1 in part because this is a dev server so the perf impact is irrelevant to me, but mostly because with 3 I wasn't seeing debug logs being printed at all 😅

However after a few tests I was surprisingly not always able to reproduce the issue. Looking at the git history (as I tried quite a few things yesterday), this is the piece of records.config that causes the issue:

CONFIG proxy.config.log.max_secs_per_buffer INT 1
CONFIG proxy.config.log.periodic_tasks_interval INT 1

When both lines are removed, logs work fine (at least for a while longer; it's not impossible the issue just takes longer to show in such case; currently started without both and waiting to find that out).

Commenting out log.max_secs_per_buffer alone does not fix the issue, and it doesn't make much sense to comment out log.periodic_tasks_interval alone (at least as per the docs).

Attached is either way the full diags.log when both are present (and I set diags to log to a file for the occasion, ofc)

diags.log

The last successful stderr output line was at 17:15:42, and the relevant (I suspect) bit is unfortunately not very informative:

[Jul 12 17:15:42.786] [LOG_FLUSH] DEBUG: <Log.cc:1421 (flush_thread_main)> (log) Successfully wrote some stuff to stderr
...
...
[Jul 12 17:15:46.833] [LOG_FLUSH] DEBUG: <LogFile.cc:275 (close_file)> (log-file) LogFile stderr is closed
[Jul 12 17:15:46.833] [LOG_FLUSH] DEBUG: <LogFile.cc:242 (open_file)> (log-file) writing header to LogFile stderr
[Jul 12 17:15:46.833] [LOG_FLUSH] DEBUG: <LogFile.cc:249 (open_file)> (log) exiting LogFile::open_file(), file=stderr presumably open
[Jul 12 17:15:46.833] [LOG_FLUSH] ERROR: Failed to write log to stderr: [tried 101, wrote 0, Bad file descriptor]
[Jul 12 17:15:47.633] [ET_NET 2] DEBUG: <traffic_server.cc:398 (periodic)> (log) in DiagsLogContinuation, checking on diags.log

is about the first error information in there

however I don't see anything else that'd be particularly obvious

Tristan971 commented 2 years ago

Nevermind the remark about the two lines removal fixing it; it does happen without them, just after approximately 7 minutes instead of 3.

Tristan971 commented 2 years ago

Tried a few more things to run the container as "standardly" as possible. Notably, starting traffic_manager (pid 1) as root, to no avail.

However something interesting seems to be that the traffic server process seems to get restarted around when it happens:

Jul 13 00:14:30 - - 1.2.3.4 - TCP_MISS 0ms - "GET http://$redacted/healthz HTTP/1.0" 200 3
Jul 13 00:14:35 - - 1.2.3.4 - TCP_MISS 0ms - "GET http://$redacted/healthz HTTP/1.0" 200 3
Jul 13 00:14:40 - - 1.2.3.4 - TCP_MISS 0ms - "GET http://$redacted/healthz HTTP/1.0" 200 3
Jul 13 00:14:45 Traffic Server 9.2.0+mangadex-105defc Jul 12 2022 23:49:12 runner-samzfqgz-project-37606524-concurrent-0ppjbl
Jul 13 00:14:45 traffic_server: using root directory '/usr'
Jul 13 00:14:45 [Jul 13 00:14:45.388] [LOG_FLUSH] ERROR: Failed to write log to stdout: [tried 101, wrote 0, Bad file descriptor]
Jul 13 00:15:49 [Jul 13 00:15:49.666] [LOG_FLUSH] ERROR: The following message was suppressed 12 times.
Jul 13 00:15:49 [Jul 13 00:15:49.666] [LOG_FLUSH] ERROR: Failed to write log to stdout: [tried 101, wrote 0, Bad file descriptor]

Here's the "extended" version of it, from the last Successfully wrote some stuff to stdout before TS restart; not very different from before however

Details ```plain Jul 13 00:23:20 [Jul 13 00:23:20.522] [LOG_FLUSH] DEBUG: (log) Successfully wrote some stuff to stdout Jul 13 00:23:21 [Jul 13 00:23:21.153] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_PREPROC 0] DEBUG: (log-preproc) 0 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-preproc) periodic tasks for 1657671801 Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-api-mutex) entering Log::periodic_tasks Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-periodic) Performing periodic tasks Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-periodic) Periodic task interval = 1 Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::check_buffer_expiration Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::check_buffer_expiration Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::roll_files Jul 13 00:23:21 [Jul 13 00:23:21.517] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::roll_files Jul 13 00:23:22 [Jul 13 00:23:22.153] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_FLUSH] DEBUG: (log-preproc) periodic tasks for 1657671802 Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_FLUSH] DEBUG: (log-api-mutex) entering Log::periodic_tasks Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_FLUSH] DEBUG: (log-periodic) Performing periodic tasks Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_FLUSH] DEBUG: (log-periodic) Periodic task interval = 1 Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_PREPROC 0] DEBUG: (log-preproc) 0 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:22 [Jul 13 00:23:22.512] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (logspace) 0 bytes being used for logs Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (logspace) 4171866112 bytes left on partition Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::check_buffer_expiration Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::check_buffer_expiration Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::roll_files Jul 13 00:23:22 [Jul 13 00:23:22.513] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::roll_files Jul 13 00:23:23 [Jul 13 00:23:23.153] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-preproc) periodic tasks for 1657671803 Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-api-mutex) entering Log::periodic_tasks Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-periodic) Performing periodic tasks Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-periodic) Periodic task interval = 1 Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::check_buffer_expiration Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::check_buffer_expiration Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::roll_files Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::roll_files Jul 13 00:23:23 [Jul 13 00:23:23.508] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:23 [Jul 13 00:23:23.509] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:23 [Jul 13 00:23:23.510] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:23 [Jul 13 00:23:23.510] [LOG_PREPROC 0] DEBUG: (log-preproc) 0 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:24 [Jul 13 00:23:24.153] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_FLUSH] DEBUG: (log-preproc) periodic tasks for 1657671804 Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_PREPROC 0] DEBUG: (log-preproc) 0 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_FLUSH] DEBUG: (log-api-mutex) entering Log::periodic_tasks Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_FLUSH] DEBUG: (log-periodic) Performing periodic tasks Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_FLUSH] DEBUG: (log-periodic) Periodic task interval = 1 Jul 13 00:23:24 [Jul 13 00:23:24.510] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:24 [Jul 13 00:23:24.511] [LOG_FLUSH] DEBUG: (logspace) 0 bytes being used for logs Jul 13 00:23:24 [Jul 13 00:23:24.511] [LOG_FLUSH] DEBUG: (logspace) 4171866112 bytes left on partition Jul 13 00:23:24 [Jul 13 00:23:24.512] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:24 [Jul 13 00:23:24.512] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:24 [Jul 13 00:23:24.513] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::check_buffer_expiration Jul 13 00:23:24 [Jul 13 00:23:24.513] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::check_buffer_expiration Jul 13 00:23:24 [Jul 13 00:23:24.513] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::roll_files Jul 13 00:23:24 [Jul 13 00:23:24.514] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::roll_files Jul 13 00:23:24 [Jul 13 00:23:24.867] [ET_NET 3] DEBUG: (log-buffer) Nothing to log, bytes_needed = 0 Jul 13 00:23:25 [Jul 13 00:23:25.150] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-preproc) periodic tasks for 1657671805 Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-api-mutex) entering Log::periodic_tasks Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-periodic) Performing periodic tasks Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-periodic) Periodic task interval = 1 Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-logbuffer) [0x795468600010] Created buffer 33 for stdout at address 0x795468998000, size 9216 Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-logbuffer) adding buffer 32 to flush list after checkout Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::check_buffer_expiration Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::check_buffer_expiration Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::roll_files Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::roll_files Jul 13 00:23:25 Traffic Server 9.2.0+mangadex-105defc Jul 12 2022 23:49:12 runner-samzfqgz-project-37606524-concurrent-0ppjbl Jul 13 00:23:25 traffic_server: using root directory '/usr' Jul 13 00:23:25 The following format was added to the global format list Jul 13 00:23:25 -------------------------------------------------------- Jul 13 00:23:25 Format : common (LOG_FORMAT_CUSTOM) (0x575b0227d700), 7 fields. Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:25 Symbols: {X-Request-ID}cqh,chi,crc,ttms,cqtx,pssc,pscl Jul 13 00:23:25 Fields : Jul 13 00:23:25 X-Request-ID cqh STR Jul 13 00:23:25 client_host_ip chi IP Jul 13 00:23:25 cache_result_code crc sINT Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-preproc) 0 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-file) entering LogFile::write_ascii_logbuffer3 for stdout (this=0x575b022ce320) Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-fieldlist) Fieldlist for {X-Request-ID}cqh,chi,crc,ttms,cqtx,pssc,pscl found in cache, #0 Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-logbuffer) [0x79546880e010] Deleting buffer 32 at address 0x795468990000 Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 1 buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:25 [Jul 13 00:23:25.506] [LOG_PREPROC 0] DEBUG: (log-preproc) 1 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:25 [Jul 13 00:23:25.507] [LOG_FLUSH] DEBUG: (log-file) LogFile stdout is closed Jul 13 00:23:25 transfer_time_ms ttms sINT Jul 13 00:23:25 client_req_text cqtx STR Jul 13 00:23:25 proxy_resp_status_code pssc sINT Jul 13 00:23:25 proxy_resp_content_len pscl sINT Jul 13 00:23:25 [Jul 13 00:23:25.507] [LOG_FLUSH] DEBUG: (log-file) writing header to LogFile stdout Jul 13 00:23:25 -------------------------------------------------------- Jul 13 00:23:25 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Jul 13 00:23:25 LogObject [0x575b02211620]: format = common (0x575b022d9270) Jul 13 00:23:25 basename = stdout Jul 13 00:23:25 flags = 0 Jul 13 00:23:25 signature = 0 Jul 13 00:23:25 full path = stdout Jul 13 00:23:25 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Jul 13 00:23:25 ----------------------------- Jul 13 00:23:25 --- Logging Configuration --- Jul 13 00:23:25 ----------------------------- Jul 13 00:23:25 Config variables: Jul 13 00:23:25 log_buffer_size = 9216 Jul 13 00:23:25 max_secs_per_buffer = 1 Jul 13 00:23:25 max_space_mb_for_logs = 25000 Jul 13 00:23:25 max_space_mb_headroom = 1000 Jul 13 00:23:25 hostname = mtc-eu-par-1 Jul 13 00:23:25 logfile_dir = /var/log/trafficserver Jul 13 00:23:25 logfile_perm = 0644 Jul 13 00:23:25 error_log_filename = stderr Jul 13 00:23:25 [Jul 13 00:23:25.507] [LOG_FLUSH] DEBUG: (log) exiting LogFile::open_file(), file=stdout presumably open Jul 13 00:23:25 [Jul 13 00:23:25.507] [LOG_FLUSH] ERROR: Failed to write log to stdout: [tried 101, wrote 0, Bad file descriptor] Jul 13 00:23:25 preproc_threads = 1 Jul 13 00:23:25 rolling_enabled = 0 Jul 13 00:23:25 rolling_interval_sec = 86400 Jul 13 00:23:25 rolling_offset_hr = 0 Jul 13 00:23:25 rolling_size_mb = 10 Jul 13 00:23:25 rolling_min_count = 0 Jul 13 00:23:25 rolling_max_count = 0 Jul 13 00:23:25 rolling_allow_empty = 0 Jul 13 00:23:25 auto_delete_rolled_files = 1 Jul 13 00:23:25 sampling_frequency = 1 Jul 13 00:23:25 file_stat_frequency = 32 Jul 13 00:23:25 space_used_frequency = 2 Jul 13 00:23:25 logbuffer_max_iobuf_index = 8 Jul 13 00:23:25 Jul 13 00:23:25 ************ Log Objects (2 objects) ************ Jul 13 00:23:25 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Jul 13 00:23:25 LogObject [0x575b02211620]: format = common (0x575b022d9270) Jul 13 00:23:25 basename = stdout Jul 13 00:23:25 flags = 0 Jul 13 00:23:25 signature = 0 Jul 13 00:23:25 full path = stdout Jul 13 00:23:25 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Jul 13 00:23:25 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Jul 13 00:23:25 LogObject [0x575b02211530]: format = (null) (0x575b022dd240) Jul 13 00:23:25 basename = stderr Jul 13 00:23:25 flags = 8 Jul 13 00:23:25 signature = 0 Jul 13 00:23:25 full path = stderr Jul 13 00:23:25 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Jul 13 00:23:25 ************ Filter List (0 filters) ************ Jul 13 00:23:25 ************ Format List (1 formats) ************ Jul 13 00:23:25 -------------------------------------------------------- Jul 13 00:23:25 Format : common (LOG_FORMAT_CUSTOM) (0x575b0227d700), 7 fields. Jul 13 00:23:25 Symbols: {X-Request-ID}cqh,chi,crc,ttms,cqtx,pssc,pscl Jul 13 00:23:25 Fields : Jul 13 00:23:25 X-Request-ID cqh STR Jul 13 00:23:25 client_host_ip chi IP Jul 13 00:23:25 cache_result_code crc sINT Jul 13 00:23:25 transfer_time_ms ttms sINT Jul 13 00:23:25 client_req_text cqtx STR Jul 13 00:23:25 proxy_resp_status_code pssc sINT Jul 13 00:23:25 proxy_resp_content_len pscl sINT Jul 13 00:23:25 -------------------------------------------------------- Jul 13 00:23:26 [Jul 13 00:23:26.146] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_PREPROC 0] DEBUG: (log-logbuffer) prepared 0 buffers Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_PREPROC 0] DEBUG: (log-api-mutex) A LogObjectManager::preproc_buffers Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_PREPROC 0] DEBUG: (log-api-mutex) R LogObjectManager::preproc_buffers Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_PREPROC 0] DEBUG: (log-preproc) 0 buffers preprocessed from LogConfig 0x575b02275440 (refcount=2) this round Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-preproc) periodic tasks for 1657671806 Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-api-mutex) entering Log::periodic_tasks Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-periodic) Performing periodic tasks Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-periodic) Periodic task interval = 1 Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (logspace) 0 bytes being used for logs Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (logspace) 4171866112 bytes left on partition Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (logspace) logical space used 1048576000, configured space 26214400000, physical space left 3123290112, partition headroom 10485760, space is available Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::check_buffer_expiration Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::check_buffer_expiration Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-api-mutex) A LogObjectManager::roll_files Jul 13 00:23:26 [Jul 13 00:23:26.539] [LOG_FLUSH] DEBUG: (log-api-mutex) R LogObjectManager::roll_files Jul 13 00:23:27 [Jul 13 00:23:27.179] [ET_NET 2] DEBUG: (log) in DiagsLogContinuation, checking on diags.log ```
Tristan971 commented 2 years ago

If someone else encounters this, for now I'm just resorting to the old-and-dirty tricks of disabling log rotation and symlinking log files to /dev/stdout or /dev/stderr...

ie ln -s /dev/stdout /var/log/trafficserver/traffic.log etc

bryancall commented 2 years ago

@Tristan971 Is it possible for you to make a Docker container, so we can reproduce the issue? We haven't been able to reproduce the issue ourselves.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.