logstash-plugins / logstash-input-beats

Apache License 2.0

"Beats Input: Remote connection closed" Connection::ConnectionClosed rapping: EOFError #53

Closed: userguy closed this issue 3 years ago

userguy commented 8 years ago

{:timestamp=>"2016-03-03T17:22:34.579000+0530", :message=>"Beats Input: Remote connection closed", :peer=>"IP:42715", :exception=>#<Lumberjack::Beats::Connection::ConnectionClosed: Lumberjack::Beats::Connection::ConnectionClosed wrapping: EOFError, End of file reached>, :level=>:warn}

This has only started happening after updating to the latest version.

I have also done a clean install on another server. What can be done about this? If I am not able to ship logs in production, it will be a difficult situation.

ph commented 8 years ago

@userguy I have a few questions concerning your installation:

ph commented 8 years ago

I have removed this warning in https://github.com/ph/logstash-input-beats/commit/29ff875eb7954077755f38e0afb28a85c636b289 and the fix is included in logstash-input-beats 2.2.4.
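
To pick that up without waiting for a new Logstash release, updating the plugin in place should work; a rough sketch, assuming a Logstash 2.x install under /opt/logstash:

# Update just the beats input plugin to the latest published version
/opt/logstash/bin/plugin update logstash-input-beats

# Confirm the installed version afterwards
/opt/logstash/bin/plugin list --verbose logstash-input-beats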

userguy commented 8 years ago

filebeat-1.1.1-1.x86_64 logstash-2.2.2-1.noarch logstash-codec-collectd (2.0.2) logstash-codec-dots (2.0.2) logstash-codec-edn (2.0.2) logstash-codec-edn_lines (2.0.2) logstash-codec-es_bulk (2.0.2) logstash-codec-fluent (2.0.2) logstash-codec-graphite (2.0.2) logstash-codec-json (2.1.0) logstash-codec-json_lines (2.1.1) logstash-codec-line (2.1.0) logstash-codec-msgpack (2.0.2) logstash-codec-multiline (2.0.9) logstash-codec-netflow (2.0.3) logstash-codec-oldlogstashjson (2.0.2) logstash-codec-plain (2.0.2) logstash-codec-rubydebug (2.0.5) logstash-filter-anonymize (2.0.2) logstash-filter-checksum (2.0.2) logstash-filter-clone (2.0.4) logstash-filter-csv (2.1.1) logstash-filter-date (2.1.2) logstash-filter-dns (2.0.2) logstash-filter-drop (2.0.2) logstash-filter-fingerprint (2.0.3) logstash-filter-geoip (2.0.5) logstash-filter-grok (2.0.3) logstash-filter-json (2.0.3) logstash-filter-kv (2.0.4) logstash-filter-metrics (3.0.0) logstash-filter-multiline (2.0.3) logstash-filter-mutate (2.0.3) logstash-filter-ruby (2.0.3) logstash-filter-sleep (2.0.2) logstash-filter-split (2.0.2) logstash-filter-syslog_pri (2.0.2) logstash-filter-throttle (2.0.2) logstash-filter-urldecode (2.0.2) logstash-filter-useragent (2.0.4) logstash-filter-uuid (2.0.3) logstash-filter-xml (2.1.1) logstash-input-beats (2.1.3) logstash-input-couchdb_changes (2.0.2) logstash-input-elasticsearch (2.0.3) logstash-input-eventlog (3.0.1) logstash-input-exec (2.0.4) logstash-input-file (2.2.1) logstash-input-ganglia (2.0.4) logstash-input-gelf (2.0.2) logstash-input-generator (2.0.2) logstash-input-graphite (2.0.5) logstash-input-heartbeat (2.0.2) logstash-input-http (2.2.0) logstash-input-http_poller (2.0.3) logstash-input-imap (2.0.3) logstash-input-irc (2.0.3) logstash-input-jdbc (3.0.0) logstash-input-kafka (2.0.4) logstash-input-log4j (2.0.5) logstash-input-lumberjack (2.0.5) logstash-input-pipe (2.0.2) logstash-input-rabbitmq (3.1.4) logstash-input-redis (2.0.2) logstash-input-s3 (2.0.4) logstash-input-snmptrap (2.0.2) logstash-input-sqs (2.0.3) logstash-input-stdin (2.0.2) logstash-input-syslog (2.0.2) logstash-input-tcp (3.0.2) logstash-input-twitter (2.2.0) logstash-input-udp (2.0.3) logstash-input-unix (2.0.4) logstash-input-xmpp (2.0.3) logstash-input-zeromq (2.0.2) logstash-output-cloudwatch (2.0.2) logstash-output-csv (2.0.3) logstash-output-elasticsearch (2.5.1) logstash-output-email (3.0.2) logstash-output-exec (2.0.2) logstash-output-file (2.2.3) logstash-output-ganglia (2.0.2) logstash-output-gelf (2.0.3) logstash-output-graphite (2.0.3) logstash-output-hipchat (3.0.2) logstash-output-http (2.1.1) logstash-output-irc (2.0.2) logstash-output-juggernaut (2.0.2) logstash-output-kafka (2.0.2) logstash-output-lumberjack (2.0.4) logstash-output-nagios (2.0.2) logstash-output-nagios_nsca (2.0.3) logstash-output-null (2.0.2) logstash-output-opentsdb (2.0.2) logstash-output-pagerduty (2.0.2) logstash-output-pipe (2.0.2) logstash-output-rabbitmq (3.0.7) logstash-output-redis (2.0.2) logstash-output-s3 (2.0.4) logstash-output-sns (3.0.2) logstash-output-sqs (2.0.2) logstash-output-statsd (2.0.5) logstash-output-stdout (2.0.4) logstash-output-tcp (2.0.2) logstash-output-udp (2.0.2) logstash-output-xmpp (2.0.2) logstash-output-zeromq (2.0.2) logstash-patterns-core (2.0.2)

Can you push it into the logstash-2.2.2-1.noarch package, so that we do not have to update it separately? Updating it from behind a proxy is another challenge, as it says connection refused.

No, there is no load balancer in between.

ph commented 8 years ago

I cannot push a new release of the noarch package with the updated plugins; it will come in our next official release. Have you tried using the bin/plugin pack command to generate a new pack of the updated plugin on a machine with DMZ access to the internet?

https://www.elastic.co/guide/en/logstash/current/offline-plugins.html
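
A sketch of that workflow, assuming both machines run the same Logstash 2.x version (paths here are examples):

# On the machine with internet access: update the plugin, then build a pack
bin/plugin update logstash-input-beats
bin/plugin pack

# Copy the generated package zip to the offline machine, then:
bin/plugin unpack /path/to/package.zip
bin/plugin install --local logstash-input-beats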

Also, you did not answer me: are you running logstash behind a proxy or not? Are you getting a lot of EOFError errors in your log? I am trying to find out why this is currently happening for you.

userguy commented 8 years ago

As of now, logs are not moving from Beats to Logstash at all; it just gives me the EOF error. No, there is no proxy. It is just three machines: one with Filebeat, one with Logstash, and one with Elasticsearch. For the offline route, I am facing issues with the offline update.

ph commented 8 years ago

We can rule out the health check from a proxy.

Any errors in the filebeat logs? Can you paste your configuration from both logstash and filebeat? It might be a configuration issue.

userguy commented 8 years ago
################### Filebeat Configuration Example #########################

############################# Filebeat ######################################
filebeat:
  # List of prospectors to fetch data.
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      # Paths that should be crawled and fetched. Glob based paths.
      # To fetch all ".log" files from a specific level of subdirectories
      # /var/log/*/*.log can be used.
      # For each file found under this path, a harvester is started.
      # Make sure no file is defined twice, as this can lead to unexpected behaviour.
      paths:
        - /opt/prod.log
        #- c:\programdata\elasticsearch\logs\*

      # Configure the file encoding for reading files with international characters
      # following the W3C recommendation for HTML5 (http://www.w3.org/TR/encoding).
      # Some sample encodings:
      #   plain, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, gbk,
      #    hz-gb-2312, euc-kr, euc-jp, iso-2022-jp, shift-jis, ...
      #encoding: plain

      # Type of the files. Based on this the way the file is read is decided.
      # The different types cannot be mixed in one prospector
      #
      # Possible options are:
      # * log: Reads every line of the log file (default)
      # * stdin: Reads the standard in
      input_type: log

      # Exclude lines. A list of regular expressions to match. It drops the lines that are
      # matching any regular expression from the list. The include_lines is called before
      # exclude_lines. By default, no lines are dropped.
      # exclude_lines: ["^DBG"]

      # Include lines. A list of regular expressions to match. It exports the lines that are
      # matching any regular expression from the list. The include_lines is called before
      # exclude_lines. By default, all the lines are exported.
      # include_lines: ["^ERR", "^WARN"]

      # Exclude files. A list of regular expressions to match. Filebeat drops the files that
      # are matching any regular expression from the list. By default, no files are dropped.
      # exclude_files: [".gz$"]

      # Optional additional fields. These fields can be freely picked
      # to add additional information to the crawled log files for filtering
      #fields:
      #  level: debug
      #  review: 1

      # Set to true to store the additional fields as top level fields instead
      # of under the "fields" sub-dictionary. In case of name conflicts with the
      # fields added by Filebeat itself, the custom fields overwrite the default
      # fields.
      #fields_under_root: false

      # Ignore files which were modified more than the defined timespan in the past.
      # Time strings like 2h (2 hours), 5m (5 minutes) can be used.
      #ignore_older: 24h

      # Type to be published in the 'type' field. For Elasticsearch output,
      # the type defines the document type these entries should be stored
      # in. Default: log
      #document_type: log

      # Scan frequency in seconds.
      # How often these files should be checked for changes. In case it is set
      # to 0s, it is done as often as possible. Default: 10s
      #scan_frequency: 10s

      # Defines the buffer size every harvester uses when fetching the file
      #harvester_buffer_size: 16384

      # Maximum number of bytes a single log event can have
      # All bytes after max_bytes are discarded and not sent. The default is 10MB.
      # This is especially useful for multiline log messages which can get large.
      #max_bytes: 10485760

      # Multiline can be used for log messages spanning multiple lines. This is common
      # for Java Stack Traces or C-Line Continuation
      #multiline:

        # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
        #pattern: ^\[

        # Defines if the pattern set under pattern should be negated or not. Default is false.
        #negate: false

        # Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
        # that was (not) matched before or after, or as long as a pattern is not matched, based on negate.
        # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
        #match: after

        # The maximum number of lines that are combined to one event.
        # In case there are more than max_lines, the additional lines are discarded.
        # Default is 500
        #max_lines: 500

        # After the defined timeout, a multiline event is sent even if no new pattern was found to start a new event
        # Default is 5s.
        #timeout: 5s

      # Setting tail_files to true means filebeat starts reading new files at the end
      # instead of the beginning. If this is used in combination with log rotation
      # this can mean that the first entries of a new file are skipped.
      #tail_files: false

      # Backoff values define how aggressively filebeat crawls new files for updates.
      # The default values can be used in most cases. Backoff defines how long filebeat waits
      # to check a file again after EOF is reached. Default is 1s, which means the file
      # is checked every second if new lines were added. This leads to near real time crawling.
      # Every time a new line appears, backoff is reset to the initial value.
      #backoff: 1s

      # Max backoff defines what the maximum backoff time is. After having backed off multiple times
      # from checking the files, the waiting time will never exceed max_backoff, independent of the
      # backoff factor. Having it set to 10s means in the worst case, when a new line is added to a log
      # file after filebeat has backed off multiple times, it takes a maximum of 10s to read the new line.
      #max_backoff: 10s

      # The backoff factor defines how fast the algorithm backs off. The bigger the backoff factor,
      # the faster the max_backoff value is reached. If this value is set to 1, no backoff will happen.
      # The backoff value will be multiplied each time with the backoff_factor until max_backoff is reached.
      #backoff_factor: 2

      # This option closes a file as soon as the file name changes.
      # This config option is recommended on windows only. Filebeat keeps the files it's reading open. This can cause
      # issues when the file is removed, as the file will not be fully removed until Filebeat also closes
      # the reading. Filebeat closes the file handler after ignore_older. During this time no new file with the
      # same name can be created. Turning this feature on, on the other hand, can lead to loss of data
      # on rotated files. It can happen that after file rotation the beginning of the new
      # file is skipped, as the reading starts at the end. We recommend to leave this option on false
      # but lower the ignore_older value to release files faster.
      #force_close_files: false

    # Additional prospector
    #-
      # Configuration to use stdin input
      #input_type: stdin

  # General filebeat configuration options
  #
  # Event count spool threshold - forces network flush if exceeded
  #spool_size: 2048

  # Defines how often the spooler is flushed. After idle_timeout the spooler is
  # flushed even though spool_size is not reached.
  #idle_timeout: 5s

  # Name of the registry file. By default it is put in the current working
  # directory. If the working directory is changed when running
  # filebeat again, indexing starts from the beginning again.
  registry_file: /var/lib/filebeat/registry

  # Full path to the directory with additional prospector configuration files. Each file must end with .yml.
  # These config files must have the full filebeat config part inside, but only
  # the prospector part is processed. All global options like spool_size are ignored.
  # The config_dir MUST point to a different directory than the one where the main filebeat config file is in.
  #config_dir:

###############################################################################
############################# Libbeat Config ##################################
# Base config file used by all other beats for using libbeat features

############################# Output ##########################################

# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.
output:

  ### Elasticsearch as output
#  elasticsearch:
    # Array of hosts to connect to.
    # Scheme and port can be left out and will be set to the default (http and 9200).
    # In case you specify an additional path, the scheme is required: http://localhost:9200/path
    # IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
#    hosts: ["localhost:9200"]

    # Optional protocol and basic auth credentials.
    #protocol: "https"
    #username: "admin"
    #password: "s3cr3t"

    # Number of workers per Elasticsearch host.
    #worker: 1

    # Optional index name. The default is "filebeat" and generates
    # [filebeat-]YYYY.MM.DD keys.
    #index: "filebeat"

    # Optional HTTP Path
    #path: "/elasticsearch"

    # Proxy server url
    #proxy_url: http://proxy:3128

    # The number of times a particular Elasticsearch index operation is attempted. If
    # the indexing operation doesn't succeed after this many retries, the events are
    # dropped. The default is 3.
    #max_retries: 3

    # The maximum number of events to bulk in a single Elasticsearch bulk API index request.
    # The default is 50.
    #bulk_max_size: 50

    # Configure the http request timeout before failing a request to Elasticsearch.
    #timeout: 90

    # The number of seconds to wait for new events between two bulk API index requests.
    # If `bulk_max_size` is reached before this interval expires, additional bulk index
    # requests are made.
    #flush_interval: 1

    # Boolean that sets if the topology is kept in Elasticsearch. The default is
    # false. This option makes sense only for Packetbeat.
    #save_topology: false

    # The time to live in seconds for the topology information that is stored in
    # Elasticsearch. The default is 15 seconds.
    #topology_expire: 15

    # TLS configuration. By default it is off.
    #tls:
      # List of root certificates for HTTPS server verifications
      #certificate_authorities: ["/etc/pki/root/ca.pem"]

      # Certificate for TLS client authentication
      #certificate: "/etc/pki/client/cert.pem"

      # Client Certificate Key
      #certificate_key: "/etc/pki/client/cert.key"

      # Controls whether the client verifies server certificates and host name.
      # If insecure is set to true, all server host names and certificates will be
      # accepted. In this mode TLS based connections are susceptible to
      # man-in-the-middle attacks. Use only for testing.
      #insecure: true

      # Configure cipher suites to be used for TLS connections
      #cipher_suites: []

      # Configure curve types for ECDHE based cipher suites
      #curve_types: []

      # Configure minimum TLS version allowed for connection to logstash
      #min_version: 1.0

      # Configure maximum TLS version allowed for connection to logstash
      #max_version: 1.2

  ### Logstash as output
  logstash:
    # The Logstash hosts
    hosts: ["IP:5044"]

    # Number of workers per Logstash host.
    #worker: 1

    # Set gzip compression level.
    #compression_level: 3

    # Optional load balance the events between the Logstash hosts
    #loadbalance: true

    # Optional index name. The default index name depends on each beat.
    # For Packetbeat, the default is set to packetbeat, for Topbeat
    # to topbeat, and for Filebeat to filebeat.
    #index: filebeat

    # Optional TLS. By default it is off.
    #tls:
      # List of root certificates for HTTPS server verifications
      #certificate_authorities: ["/etc/pki/root/ca.pem"]

      # Certificate for TLS client authentication
      #certificate: "/etc/pki/client/cert.pem"

      # Client Certificate Key
      #certificate_key: "/etc/pki/client/cert.key"

      # Controls whether the client verifies server certificates and host name.
      # If insecure is set to true, all server host names and certificates will be
      # accepted. In this mode TLS based connections are susceptible to
      # man-in-the-middle attacks. Use only for testing.
      #insecure: true

      # Configure cipher suites to be used for TLS connections
      #cipher_suites: []

      # Configure curve types for ECDHE based cipher suites
      #curve_types: []

  ### File as output
  #file:
    # Path to the directory where to save the generated files. The option is mandatory.
    #path: "/tmp/filebeat"

    # Name of the generated files. The default is `filebeat` and it generates files: `filebeat`, `filebeat.1`, `filebeat.2`, etc.
    #filename: filebeat

    # Maximum size in kilobytes of each file. When this size is reached, the files are
    # rotated. The default value is 10 MB.
    #rotate_every_kb: 10000

    # Maximum number of files under path. When this number of files is reached, the
    # oldest file is deleted and the rest are shifted from last to first. The default
    # is 7 files.
    #number_of_files: 7

  ### Console output
  # console:
    # Pretty print json event
    #pretty: false

############################# Shipper #########################################

shipper:
  # The name of the shipper that publishes the network data. It can be used to group
  # all the transactions sent by a single shipper in the web interface.
  # If this option is not defined, the hostname is used.
  #name:

  # The tags of the shipper are included in their own field with each
  # transaction published. Tags make it easy to group servers by different
  # logical properties.
  #tags: ["service-X", "web-tier"]

  # Uncomment the following if you want to ignore transactions created
  # by the server on which the shipper is installed. This option is useful
  # to remove duplicates if shippers are installed on multiple servers.
  #ignore_outgoing: true

  # How often (in seconds) shippers are publishing their IPs to the topology map.
  # The default is 10 seconds.
  #refresh_topology_freq: 10

  # Expiration time (in seconds) of the IPs published by a shipper to the topology map.
  # All the IPs will be deleted afterwards. Note that the value must be higher than
  # refresh_topology_freq. The default is 15 seconds.
  #topology_expire: 15

  # Internal queue size for single events in processing pipeline
  #queue_size: 1000

  # Configure local GeoIP database support.
  # If no paths are configured, geoip is disabled.
  #geoip:
    #paths:
    #  - "/usr/share/GeoIP/GeoLiteCity.dat"
    #  - "/usr/local/var/GeoIP/GeoLiteCity.dat"

############################# Logging #########################################

# There are three options for the log output: syslog, file, stderr.
# Under Windows systems, the log files are per default sent to the file output,
# under all other systems per default to syslog.
logging:

  # Send all logging output to syslog. On Windows default is false, otherwise
  # default is true.
  #to_syslog: true

  # Write all logging output to files. Beats automatically rotate files if rotateeverybytes
  # limit is reached.
  #to_files: false

  # To enable logging to files, to_files option has to be set to true
  files:
    # The directory where the log files will be written to.
    #path: /var/log/mybeat

    # The name of the files where the logs are written to.
    #name: mybeat

    # Configure log file size limit. If limit is reached, log file will be
    # automatically rotated
    rotateeverybytes: 10485760 # = 10MB

    # Number of rotated log files to keep. Oldest files will be deleted first.
    #keepfiles: 7

  # Enable debug output for selected components. To enable all selectors use ["*"]
  # Other available selectors are beat, publish, service
  # Multiple selectors can be chained.
  #selectors: [ ]

  # Sets log level. The default log level is error.
  # Available log levels are: critical, error, warning, info, debug
  #level: error

And the Logstash input configuration:

input {
  beats {
    port => 5044
  }
}
userguy commented 8 years ago

Also, can you help me with how to enable logging in filebeat?

ruflin commented 8 years ago

@userguy To enable logging in filebeat: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-logging.html
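
Concretely, in filebeat 1.x that boils down to something like the following in filebeat.yml (the path and level here are only examples):

logging:
  # Write log output to files instead of syslog
  to_files: true
  files:
    path: /var/log/filebeat
    name: filebeat.log
    rotateeverybytes: 10485760 # = 10MB
  # Raise verbosity from the default "error"
  level: debug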

userguy commented 8 years ago

@ph any input? In the meantime I will try to update via the offline package.

ph commented 8 years ago

@userguy From your configuration files, you seem to be using a plain TCP connection between filebeat and the input. The configuration seems fine to me; did you try adding debug logging in filebeat? It would be useful for us to see what filebeat is doing.
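
If editing the config is awkward, filebeat can also be run in the foreground with debug selectors enabled; a quick sketch (the config path is an assumption):

# -e sends logs to stderr, -d "*" enables all debug selectors
filebeat -e -d "*" -c /etc/filebeat/filebeat.yml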

sunilmchaudhari commented 8 years ago

@userguy, I was facing the same problem with the same use case, filebeat -> logstash -> ES. I found that the TLS option should be enabled in the filebeat configuration. You might be setting ssl => true in your Logstash beats input while the tls option is disabled on the filebeat side. Try enabling it.
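
For reference, a matching pair would look roughly like this (certificate paths are examples, not taken from this issue):

# Logstash side: beats input with SSL enabled
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash.crt"
    ssl_key => "/etc/pki/tls/private/logstash.key"
  }
}

# Filebeat side: filebeat.yml, with the tls section uncommented to match
output:
  logstash:
    hosts: ["IP:5044"]
    tls:
      certificate_authorities: ["/etc/pki/tls/certs/logstash.crt"]

If only one side has TLS enabled, the handshake fails, which can surface as exactly this kind of ConnectionClosed/EOFError warning on the input.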

jsvd commented 3 years ago

Closing this issue, as this plugin's internals have changed a lot since then. Please create a new issue even if you come across a similar problem.