bzikarsky / gelf-php

A php implementation to send log-files to a gelf compatible backend like graylog
MIT License
410 stars 85 forks source link

Some udp message are not written #117

Closed devployment closed 5 years ago

devployment commented 5 years ago

Trying to send log messages over udp seems to fail for some message. The failure seems to be related to context data.

If I remove data from the context the message is sent. Adding back some data makes the sending fail.

Fail in this case means no issue that I could see anywhere. I just debugged it with Wireshark, to check what leaves my machine and what not.

The easiest way to reproduce was to add json_encode($_SERVER) as context data, by using Monolog.

// Nothing will be captured in Wireshark
$log->debug("This will fail", ["server_data" => json_encode($_SERVER)]);

// Sending will be captured in Wireshark
$log->debug("This will be fine", ["server_data" => ["foo" => "bar"]);

Using other data structures with json_encode works fine. So it might be somehow related to the data itself. Currently no idea what goes wrong.

devployment commented 5 years ago

Seems to be related to the length of the context data.

If I try to log the following as context it works

$serverData = [
    "PHP_EXTRA_CONFIGURE_ARGS"                  => "--enable-fpm --with-fpm-user=www-data --with-fpm-group=www-data --disable-cgi",
    "HOSTNAME"                                  => "a0648331a174",
    "PHP_INI_DIR"                               => "/usr/local/etc/php",
    "HOME"                                      => "/var/www",
    "INSTALL_IMAGEMAGICK"                       => "false",
    "PHP_IDE_CONFIG"                            => "serverName=laradock",
    "PHP_LDFLAGS"                               => "-Wl,-O1 -Wl,--hash-style=both -pie",
    "PHP_CFLAGS"                                => "-fstack-protector-strong -fpic -fpie -O2",
    "PHP_MD5"                                   => "",
    "PHP_VERSION"                               => "5.6.40",
    "GPG_KEYS"                                  => "XXXXXX5F97500D450838XXXXXX857D9A90XXXXXX XXXXXXXB321FDC07F2XXXXXXX3AC2BF0BC433XXXXXX",
    "INSTALL_MSSQL"                             => "false",
    "PHP_CPPFLAGS"                              => "-fstack-protector-strong -fpic -fpie -O2",
    "PHP_ASC_URL"                               => "https://secure.php.net/get/php-5.6.40.tar.xz.asc/from/this/mirror",
    "INSTALL_AEROSPIKE"                         => "false",
    "PHP_URL"                                   => "https://secure.php.net/get/php-5.6.40.tar.xz/from/this/mirror",
    "PATH"                                      => "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
    "INSTALL_IMAP"                              => "false",
    "PHPIZE_DEPS"                               => "autoconf        dpkg-dev        file        g++         gcc         libc-dev        make        pkg-config      re2c",
    "PWD"                                       => "/var/www",
    "PHP_SHA256"                                => "XXXXXX1eee3995d7fbd1c5342e5ccXXXXXXe276d561595b6052b21ace2XXXXXX",
    "INSTALL_IMAGE_OPTIMIZERS"                  => "false",
    "DB_1_ENV_MYSQL_PASSWORD"                   => "",
    "DB_1_ENV_MYSQL_USER"                       => "",
    "DB_1_ENV_MYSQL_DATABASE"                   => "",
    "USER"                                      => "www-data",
    "FCGI_ROLE"                                 => "RESPONDER",
    "askapache"                                 => "",
    "HTTP_HOST"                                 => "local.test",
    "HTTP_USER_AGENT"                           => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:69.0) Gecko/20100101 Firefox/69.0",
    "HTTP_ACCEPT"                               => "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "HTTP_ACCEPT_LANGUAGE"                      => "de,en-US;q=0.7,en;q=0.3",
    "HTTP_ACCEPT_ENCODING"                      => "gzip, deflate",
    "HTTP_CONNECTION"                           => "keep-alive",
    "HTTP_COOKIE"                               => "cookieconsent_status=dismiss; TP=9014471; PHPSESSID=ie8akdgbo13kb8tlrj7uke84a5; C=0; QUICKSTART=j017s0aimbs9h64ngajbcmo855; portainer.UI_STATE=%7B%22dismissedInfoPanels%22%3A%7B%7D%2C%22dismissedInfoHash%22%3A%22xpdlpJcnPgN9%2BEK1Usal2A%3D%3D%22%7D; toggle=true",
    "HTTP_UPGRADE_INSECURE_REQUESTS"            => "1",
    "HTTP_CACHE_CONTROL"                        => "max-age=0",
    "SERVER_SIGNATURE"                          => "Apache/2.4.18 (Ubuntu) Server at local.test Port 80",
    "SERVER_SOFTWARE"                           => "Apache/2.4.18 (Ubuntu)",
    "SERVER_NAME"                               => "local.test",
    "SERVER_ADDR"                               => "172.18.0.8",
    "SERVER_PORT"                               => "80",
    "REMOTE_ADDR"                               => "172.18.0.1",
    "DOCUMENT_ROOT"                             => "/var/www/",
    "REQUEST_SCHEME"                            => "http",
    "CONTEXT_PREFIX"                            => "",
    "CONTEXT_DOCUMENT_ROOT"                     => "/var/www/",
    "SERVER_ADMIN"                              => "no address given",
    "SCRIPT_FILENAME"                           => "/var/www/index.php",
];

// All good!
$log->debug("Server Data", ["server_data" => $serverData]);

As soon I add the next entry it just silently fails.

$serverData = [
   "All From Above"                                    => "...",
   "SERVER_PROTOCOL"                           => "HTTP/1.1",
];

// Silently fails
$log->debug("Server Data", ["server_data" => $serverData]);

Any ideas? Anybody?

bzikarsky commented 5 years ago

A few questions:

devployment commented 5 years ago

Whats the error you get?

No error at all. That's the weird part

How long (number of bytes) is the failing encoded context?

It seems to fail at about 1452 bytes. The number is not constant. But always in this area.

How is your transport configured? Especially in regards to the packet size?

I deactivated chunking for testing. Seems like chunked messages are not received at all. Experimented with all different sizes.

$gelfTransport = new Gelf\Transport\UdpTransport($_ENV['GELF_HOST'], $_ENV['GELF_PORT'], 0);

It might be related to the receiving server, and or the network in between. Currently investigating in this direction as there is no obvious error from the library.

bzikarsky commented 5 years ago

Since there is no error from the library, I assume that the library does not get any feedback that something went wrong with the UDP send.

If I had to guess I'd say something is wrong with the MTU somewhere along the network path. 1452 feels suspiciously close to the ethernet default of 1500. Can you enable chunking with a significantly smaller size (e.g. 500) and see if this works?

You can also run netstat -anus on the participating machines and look for UDP errors. Maybe this helps.

devployment commented 5 years ago

I was not able to find anything. Meanwhile I switched to TCP and this works for now without issues.

If I return to UDP and fins something related I'll update this here just in case.

mathielen commented 5 years ago

Hi there, I have the exact same problem. Also the problematic size is around 1500.

  1. I tried to send small data in multiple chunks of 100 bytes. That did succeed.
  2. So I tried to send some big (3k) data in chunks of 100 bytes. That did succeed as well.

Turns out, the split_length of str_split in sendMessageInChunks is too big. The chunk is too big here, as later on all the magic bytes for udp-gelf are added which results in a chunk that is > chunkSize.

A quick hack, reserving 15 bytes for the gelf-magic: $chunks = str_split($rawMessage, $this->chunkSize-15); => that works.

bzikarsky commented 5 years ago

Yeah, this is true. It probably didn't surface yet, because one could adjust the chunk-size manually. But good catch! Thanks.

Though I diasgree with 15bytes - I count 12:

mathielen commented 5 years ago

Please have a look at: https://github.com/bzikarsky/gelf-php/pull/118#issuecomment-517715427

bzikarsky commented 5 years ago

I will reinvestigate. Feels like an off-by-one-error.

If you want to support the cause, you could add your failing tests in a PR. :-)

bzikarsky commented 5 years ago

Do you know your network's MTU?

Because we have to differentiate between 2 things:

a) gelf-php creating too big packets, where "too big" means that somehow a UDP packet with a size bigger than $chunkSize is sent

b) the packet not reaching its destination because the $chunkSize and additional non-Gelf overhead (e.g. UDP header) exceeds the MTU somewhere.

The initial bug was related to a) not taking the GELF chunk overhead into account. But you may also have a problem with b). Your chunkSize needs to be smaller or equal to the MTU minus at least the UDP overhead (which is AFAIR four 2byte.fields -> 8bytes).

mathielen commented 5 years ago

Hey @bzikarsky ,

thanks for looking into it! The MTU is set to 1500. In my case I have changed the chunk size to 1400 and everything looks good so far. Btw I am sending data to a graylog server instance through an openvpn tunnel. So there are lots of other possible causes and thresholds that may prevent the data from reaching its destination.

Anyway I am happy with the current implementation and you can always set your own chunk size through the constructor, so I wouldnt mind.

Keep up the good work 👍 Markus

bzikarsky commented 5 years ago

Ha! OpenVPN! This will eat into your MTU as well. I guess OpenVPN does not properly know about your network's MTU and will not correctly transfer our packets via the wire. maybe this article is of interest for you: https://www.sonassi.com/help/troubleshooting/setting-correct-mtu-for-openvpn

mathielen commented 5 years ago

wooow! Awesome info. Turns out the MTU within the VPN is 1419. Obviously! Ok, now that makes a lot of sense after all :) Thank you!

bzikarsky commented 5 years ago

You are very welcome. :)