Closed apgiorgi closed 1 month ago
Hello!
Thanks for reaching out. Flushing behavior is currently as follows:
Questions:
PCAP_ROTATE_SECS
Thanks @gchux!
The use case is to capture the very last request, the one that results in a SIGTERM. So the timing is tight.
Currently, PCAP_ROTATE_SECS used the default of 60 seconds. The next test is to set it to 5 seconds (or less) and try to have those files flushed before instance termination, maybe by adding some delay in the application's SIGTERM handler. Nevertheless, it would be nice if tcpdumpw (or pcap-cli?) could handle the SIGTERM and guarantee that the last capture gets flushed correctly, but I'm not sure if it's possible at all (can you send a signal to tcpdump?)
The envs defined:
env:
  - name: PCAP_IFACE
    value: eth
  - name: PCAP_GCS_BUCKET
    value: pnp-eu-tcpdump
  - name: PCAP_FILTER
    value: tcp
Here are the arguments:
iface:
use_cron:false
cron_exp:-
timezone:UTC
timeout:0
extension:pcap
directory:/pcap-tmp
snaplen:0
filter:TCP
interval:60
tcpdump:true
jsondump:false
jsonlog:false
ordered:false
hc_port:12345
gcp_gae:false
Hello @apgiorgi,
I've reviewed this behavior in depth with @thomasmburke, and we have defined a path for exporting the remaining PCAP files on SIGTERM:
- Handle SIGTERM in the init script and propagate it: https://github.com/gchux/cloud-run-tcpdump/blob/v1.0.60-RC4/scripts/init#L101-L108
- Handle SIGTERM at the GCS FUSE process holder and propagate it with an 8s delay to allow the remaining PCAP files to be fully exported: https://github.com/gchux/cloud-run-tcpdump/blob/v1.0.60-RC4/scripts/start_gcsfuse#L6
- Flush PCAP files on the file system notifier: https://github.com/gchux/cloud-run-tcpdump/blob/v1.0.60-RC4/pcap-fsnotify/main.go#L414
So the issue was that the sidecar's entrypoint was not properly propagating the signal to the supervisor process.
In addition to that, we have enabled flushing files concurrently on receiving the SIGTERM from Cloud Run.
See commit: https://github.com/gchux/cloud-run-tcpdump/commit/e4c4ad10a10395c3371cef2e8aec22f988c728c5
Feel free to test it out using: ghcr.io/gchux/cloud-run-tcpdump:v1.0.60-RC4
Here's a sample of what it looks like in action:
ghcr.io/gchux/cloud-run-tcpdump:v1.0.60-RC5 is now available with improved logging for PCAP file flushing.
Also, I don't see the log output for "signaled:", as I would expect from https://github.com/gchux/cloud-run-tcpdump/blob/main/tcpdumpw/main.go#L404
Would there be a way to guarantee that tcpdump terminates cleanly and flushes the pcap to GCS before the final instance shutdown?