dmachard / go-dnscollector

Ingesting, pipelining, and enhancing your DNS logs with usage indicators, security analysis, and additional metadata.
MIT License
216 stars 48 forks source link

file logger: buffer is full, 5768 dnsmessage(s) dropped #837

Closed yadavp27 closed 2 weeks ago

yadavp27 commented 1 month ago

Hi,

Im using go-dnscollector v1.0.0, getting the warring message buffer is full, 5768 dnsmessage(s) dropped. tried setting the default values chan-buffer-size: 0 and 30720, below is the config file can some one suggest if any tweak is needed in the config file.

global:
  trace:
    verbose: true

pipelines:
  - name: tap
    dnstap:
      sock-path: /var/run/dnscollector/dnstap.sock
    routing-policy:
      forward: [ file ]

  - name: file
    logfile:
      file-path:  "/var/log/dnscollector/dnstap.log"
      compress: true
      max-size: 30720
      max-files: 7
      mode: json
      flush-interval: 10
      chan-buffer-size: 0
      text-format: "timestamp identity qr operation rcode queryip queryport responseip responseport family protocol length qname qtype latency ttl"

logs: Oct 07 18:06:16 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:06:16.724111 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 1596 dnsmessage(s) dropped Oct 07 18:06:26 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:06:26.724988 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6556 dnsmessage(s) dropped Oct 07 18:06:36 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:06:36.725535 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6724 dnsmessage(s) dropped Oct 07 18:06:46 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:06:46.726453 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5972 dnsmessage(s) dropped Oct 07 18:06:56 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:06:56.726685 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 7065 dnsmessage(s) dropped Oct 07 18:07:06 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:07:06.727064 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6029 dnsmessage(s) dropped Oct 07 18:07:16 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:07:16.727741 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5829 dnsmessage(s) dropped Oct 07 18:07:26 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:07:26.728751 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6174 dnsmessage(s) dropped Oct 07 18:07:36 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:07:36.729660 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5778 dnsmessage(s) dropped Oct 07 18:07:46 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:07:46.730720 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6140 dnsmessage(s) dropped Oct 07 18:07:56 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:07:56.731152 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6053 dnsmessage(s) dropped Oct 07 18:08:06 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:08:06.732230 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5895 dnsmessage(s) dropped Oct 07 18:08:16 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:08:16.732329 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6083 dnsmessage(s) dropped Oct 07 18:08:26 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:08:26.732412 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 6050 dnsmessage(s) dropped Oct 07 18:08:36 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:08:36.732935 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5488 dnsmessage(s) dropped Oct 07 18:08:46 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:08:46.733642 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5914 dnsmessage(s) dropped Oct 07 18:08:56 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:08:56.733756 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 7116 dnsmessage(s) dropped Oct 07 18:09:06 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:09:06.733896 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5845 dnsmessage(s) dropped Oct 07 18:09:16 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:09:16.733973 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5587 dnsmessage(s) dropped Oct 07 18:09:26 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:09:26.734995 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5570 dnsmessage(s) dropped Oct 07 18:09:36 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:09:36.735098 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5593 dnsmessage(s) dropped Oct 07 18:09:46 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:09:46.735690 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5701 dnsmessage(s) dropped Oct 07 18:09:56 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:09:56.736378 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5875 dnsmessage(s) dropped Oct 07 18:10:06 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:10:06.736505 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 5680 dnsmessage(s) dropped Oct 07 18:10:16 dnscollector go-dnscollector[3564495]: WARNING: 2024/10/07 18:10:16.737712 worker - [tap] (conn https://github.com/dmachard/go-dnscollector/issues/1) dnstap processor - worker[file] buffer is full, 4175 dnsmessage(s) dropped

can some please help on this.

Thanks.

dmachard commented 1 month ago

Hi,

Could you try to update your config with the following settings ?

global:
  ...
  worker:
    buffer-size: 8192
  telemetry:
    enabled: false

....
  - name: file
    logfile:
      .....
      flush-interval: 1

and also use the new beta release https://github.com/dmachard/go-dnscollector/releases/tag/v1.1.0-beta2

yadavp27 commented 1 month ago

Hello @dmachard,

i Tested with the above changes, still seeing the same below is the config

global:
  trace:
    verbose: true
  worker:
    buffer-size: 8192
  telemetry:
    enabled: false

pipelines:
  - name: tap
    dnstap:
      sock-path: /var/run/dnscollector/dnstap.sock
    routing-policy:
      forward: [ file ]

  - name: file
    logfile:
      file-path:  "/var/log/dnscollector/dnstap.log"
      compress: true
      max-size: 30720
      max-files: 7
      mode: json
      flush-interval: 1
      chan-buffer-size: 0
      text-format: "timestamp identity qr operation rcode queryip queryport responseip responseport family protocol length qname qtype latency ttl"

logs:

Oct 09 23:16:46 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:16:46.274899 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 5480 dnsmessage(s) dropped
Oct 09 23:16:56 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:16:56.275917 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 5933 dnsmessage(s) dropped
Oct 09 23:17:06 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:17:06.276921 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 5931 dnsmessage(s) dropped
Oct 09 23:17:16 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:17:16.277991 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 6150 dnsmessage(s) dropped
Oct 09 23:17:26 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:17:26.278450 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 6271 dnsmessage(s) dropped
Oct 09 23:17:36 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:17:36.279367 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 5834 dnsmessage(s) dropped
Oct 09 23:17:46 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:17:46.280017 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 5921 dnsmessage(s) dropped
Oct 09 23:17:56 dnscollector go-dnscollector[3660579]: WARNING: 2024/10/09 23:17:56.280257 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 5981 dnsmessage(s) dropped
dmachard commented 1 month ago

Are you using the v1.1.0-beta2 ? How much traffic load is being sent?"

yadavp27 commented 1 month ago

im using V1.0.0, 1083646 QPS.

dmachard commented 1 month ago

Please to try the v1.1.0-beta2 or the Head of the repo I pushed a fix to reduce the number of writes on disk

dmachard commented 1 month ago

Any feedback will be appreciated

yadavp27 commented 1 month ago

go-dnscollector -version 1.1.0-beta4

still the same Waring.

Oct 16 07:18:48 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:18:48.827204 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 1874 dnsmessage(s) dropped
Oct 16 07:18:58 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:18:58.827837 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3611 dnsmessage(s) dropped
Oct 16 07:19:08 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:19:08.828742 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3892 dnsmessage(s) dropped
Oct 16 07:19:18 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:19:18.828915 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3425 dnsmessage(s) dropped
Oct 16 07:19:28 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:19:28.829205 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 2918 dnsmessage(s) dropped
Oct 16 07:19:38 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:19:38.829754 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3746 dnsmessage(s) dropped
Oct 16 07:19:48 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:19:48.830708 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3562 dnsmessage(s) dropped
Oct 16 07:19:58 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:19:58.830855 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 2968 dnsmessage(s) dropped
Oct 16 07:20:08 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:20:08.831800 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3625 dnsmessage(s) dropped
Oct 16 07:20:18 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:20:18.833092 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 4460 dnsmessage(s) dropped
Oct 16 07:20:28 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:20:28.833395 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3642 dnsmessage(s) dropped
Oct 16 07:20:38 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:20:38.833711 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3425 dnsmessage(s) dropped
Oct 16 07:20:48 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:20:48.834768 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3519 dnsmessage(s) dropped
Oct 16 07:20:58 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:20:58.835572 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3371 dnsmessage(s) dropped
Oct 16 07:21:08 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:21:08.835905 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3431 dnsmessage(s) dropped
Oct 16 07:21:18 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:21:18.836343 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3414 dnsmessage(s) dropped
Oct 16 07:21:28 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:21:28.836897 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 2992 dnsmessage(s) dropped
Oct 16 07:21:38 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:21:38.837576 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3629 dnsmessage(s) dropped
Oct 16 07:21:48 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:21:48.838139 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3455 dnsmessage(s) dropped
Oct 16 07:21:58 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:21:58.839217 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3571 dnsmessage(s) dropped
Oct 16 07:22:08 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:22:08.839556 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3910 dnsmessage(s) dropped
Oct 16 07:22:18 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:22:18.840637 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3383 dnsmessage(s) dropped
Oct 16 07:22:28 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:22:28.841005 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 2817 dnsmessage(s) dropped
Oct 16 07:22:38 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:22:38.841566 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3709 dnsmessage(s) dropped
Oct 16 07:22:48 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:22:48.842575 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3609 dnsmessage(s) dropped
Oct 16 07:22:58 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:22:58.843503 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3476 dnsmessage(s) dropped
Oct 16 07:23:08 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:23:08.843589 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3981 dnsmessage(s) dropped
Oct 16 07:23:18 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:23:18.843724 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 3617 dnsmessage(s) dropped
Oct 16 07:23:28 dnscollector go-dnscollector[3839624]: WARNING: 2024/10/16 07:23:28.844797 worker - [tap] (conn #1) dnstap processor - worker[file] buffer is full, 1428 dnsmessage(s) dropped

config:

global:
  trace:
    verbose: true
  worker:
    buffer-size: 8192
  telemetry:
    enabled: false

pipelines:
  - name: tap
    dnstap:
      sock-path: /var/run/dnscollector/dnstap.sock
    routing-policy:
      forward: [ file ]

  - name: file
    logfile:
      file-path:  "/var/log/dnscollector/dnstap.log"
      compress: true
      max-size: 30720
      max-files: 7
      mode: json
      flush-interval: 1
      chan-buffer-size: 0
      text-format: "timestamp identity qr operation rcode queryip queryport responseip responseport family protocol length qname qtype latency ttl"
dmachard commented 1 month ago

Could you share more details regarding your server ( disk, ram and cpu)

yadavp27 commented 1 month ago
 lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           8

ram : 24gb

lshw -class disk -class storage
  *-ide
       description: IDE interface
       product: 82371AB/EB/MB PIIX4 IDE
       vendor: Intel Corporation
       physical id: 7.1
       bus info: pci@0000:00:07.1
       version: 01
       width: 32 bits
       clock: 33MHz
       capabilities: ide isa_compat_mode pci_native_mode bus_master
       configuration: driver=ata_piix latency=64
       resources: irq:0 ioport:1f0(size=8) ioport:3f6 ioport:170(size=8) ioport:376 ioport:2050(size=16)
  *-sas
       description: Serial Attached SCSI controller
       product: PVSCSI SCSI Controller
       vendor: VMware
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: scsi0
       version: 02
       width: 64 bits
       clock: 33MHz
       capabilities: sas pciexpress msi pm msix bus_master cap_list rom
       configuration: driver=vmw_pvscsi latency=0
       resources: irq:18 ioport:4000(size=8) memory:fea10000-fea17fff memory:fea00000-fea0ffff
     *-disk:0
          description: SCSI Disk
          product: Virtual disk
          vendor: VMware
          physical id: 0.0.0
          bus info: scsi@0:0.0.0
          logical name: /dev/sda
          version: 2.0
          size: 110GiB (118GB)
          capabilities: gpt-1.00 partitioned partitioned:gpt
          configuration: ansiversion=6 guid=bced0c82-4aa7-4c04-a612-fb4b63ebdcea logicalsectorsize=512 sectorsize=512
     *-disk:1
          description: SCSI Disk
          product: Virtual disk
          vendor: VMware
          physical id: 0.1.0
          bus info: scsi@0:0.1.0
          logical name: /dev/sdb
          version: 2.0
          size: 70GiB (75GB)
          capabilities: partitioned partitioned:dos
          configuration: ansiversion=6 logicalsectorsize=512 sectorsize=512 signature=36688089
dmachard commented 1 month ago

The beta4 improved the logger but it's not enough in your case. Can you enable the global telemetry and share your dashboard ?

image

I make more tests with and reproduced in my lab with ~70k req/s

yadavp27 commented 1 month ago

Hello @dmachard

This issue occurs when the logs are being rotated. The file is being renamed by dnscollector. During the rename and compress process, the dnstap.log file becomes unavailable and something happens.

To test this, I disabled the compress: false option during log rotation and did not observe any concerning issues.

Here is the output of the "ls -li" command:

total 45804092
136 -rw-r--r--. 1 dnscollector dnscollector  1991368893 Oct 19 05:09 dnstap-1729339380674388499.log.gz
137 -rw-r--r--. 1 dnscollector dnscollector  1852668699 Oct 20 15:42 dnstap-1729463780580942637.log.gz
138 -rw-r--r--. 1 dnscollector dnscollector  2192224271 Oct 21 06:57 dnstap-1729518723535040633.log.gz
139 -rw-r--r--. 1 dnscollector dnscollector  2124991919 Oct 22 01:01 dnstap-1729583707451467441.log.gz
133 -rw-r--r--. 1 dnscollector dnscollector  2163008098 Oct 22 19:09 dnstap-1729649384744867764.log.gz
134 -rw-r--r--. 1 dnscollector dnscollector  2219447496 Oct 24 02:52 dnstap-1729763534150940127.log.gz
131 -rw-r--r--. 1 dnscollector dnscollector 32212245632 Oct 24 21:57 dnstap-1729832230614616098.log
132 -rw-r--r--. 1 dnscollector dnscollector  1433690154 Oct 24 22:33 dnstap.log 
  - name: file
    logfile:
      file-path:  "/var/log/dnscollector/dnstap.log"
      compress: true
      max-size: 30720
      max-files: 7
      mode: json
      flush-interval: 1
      chan-buffer-size: 0
      text-format: "timestamp identity qr operation rcode queryip queryport responseip responseport family protocol length qname qtype latency ttl"

can you please suggest what could be done.

Thanks

dmachard commented 1 month ago

Thank you for your feedback. The compressed files are quite large in your case, around ~2GB each.

I've identified two potential actions:

dmachard commented 1 month ago

@yadavp27, could you try the beta release 1.2.0-beta1? I pushed a fix to prevent blocking during compression.

yadavp27 commented 1 month ago

@dmachard can i use this max-size: 30720 or default value 100 and test this out.

dmachard commented 1 month ago

You can keep your value (30720) but can you test with the beta release provided?

dmachard commented 2 weeks ago

Fixed in release 1.2.0. If you encounter any further issues, please feel free to open a new ticket. Thank you for your feedback!