logstash-plugins / logstash-output-google_cloud_storage

Deleted files remain in use by the system eventually filling up disk space #19

Closed arslanm closed 6 years ago

arslanm commented 6 years ago

We're using a slightly modified version of this plugin (3.0.4) in our infrastructure. There are two differences:

First, just below the line

message = LogStash::Json.dump(event.to_hash)

(line 149), we added:

message = message.gsub(/\"@(\w+)\"/, '"\1"')

to remove the "@" sign from the field names and keep BigQuery happy. The second difference is the file name format. Instead of the stock get_base_path:

  def get_base_path
    return @temp_directory + File::SEPARATOR + @log_file_prefix + "_" +
      Socket.gethostname() + "_" + Time.now.strftime(@date_pattern)
  end

we do this:

  def get_base_path
    return @temp_directory + File::SEPARATOR + @log_file_prefix + "_" +
      Time.now.strftime(@date_pattern) + "_" + Socket.gethostname()
  end

so that daily files from multiple Logstash instances can be loaded into Google BigQuery with a single wildcard on the GCS object prefix. File names now follow the pattern "prefix_YYYY-MM-DD_hostname" instead of "prefix_hostname_YYYY-MM-DD".

(Ex: prefix_YYYY-MM-DD_* which works vs. prefix_host*_YYYY-MM-DD* which does not.)
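
To make the difference concrete, here is a minimal Ruby sketch comparing the two layouts. The prefix, date, and host names are made-up values, and File.fnmatch is only a local stand-in for wildcard matching on GCS object names, so treat it as an illustration rather than a description of how BigQuery resolves URIs.

```ruby
# Hypothetical values standing in for the plugin settings and two shipper hosts.
prefix = "logstash_gcs"
date   = "2018-02-08"
hosts  = ["shipper-a", "shipper-b"]

# Stock layout: prefix_hostname_date
stock = hosts.map { |h| [prefix, h, date].join("_") + ".part000.log.gz" }

# Modified layout: prefix_date_hostname
modified = hosts.map { |h| [prefix, date, h].join("_") + ".part000.log.gz" }

# A single trailing wildcard that should select one day across all hosts.
pattern = "#{prefix}_#{date}_*"

modified_matches = modified.select { |name| File.fnmatch(pattern, name) }
stock_matches    = stock.select { |name| File.fnmatch(pattern, name) }

puts "modified layout matched: #{modified_matches.size}"
puts "stock layout matched:    #{stock_matches.size}"
```

With the stock layout the host name sits between the prefix and the date, so selecting one day for every host would need a second wildcard in the middle of the name.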

Our output configuration:

  google_cloud_storage {
    bucket => "mybucket"
    key_path => "/ssl/key"
    service_account => "service@account"
    temp_directory => "/tmp/logstash_storage"
    date_pattern => "%Y-%m-%d"
    flush_interval_secs => 2
    uploader_interval_secs => 180
    max_file_size_kbytes => 102400
    gzip => true
    output_format => "json"
    codec => "json"
  }

The instances that run Logstash with this plugin eventually run out of disk space. lsof shows deleted part files that the Logstash process still holds open:

$ sudo lsof | grep deleted | grep shipper
java 28456 logstash 127r REG 8,1 104859155 6242 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part012.log.gz (deleted)
java 28456 logstash 139r REG 8,1 104862422 8797 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part013.log.gz (deleted)
java 28456 logstash 154r REG 8,1 104860142 13306 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part014.log.gz (deleted)
java 28456 logstash 155r REG 8,1 104865750 22383 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part008.log.gz (deleted)
java 28456 logstash 156r REG 8,1 104864309 13003 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part010.log.gz (deleted)
java 28456 logstash 157r REG 8,1 104864216 12593 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part011.log.gz (deleted)
java 28456 logstash 158r REG 8,1 104860763 4211 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part016.log.gz (deleted)
java 28456 logstash 162r REG 8,1 104859803 13363 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part017.log.gz (deleted)
java 28456 logstash 163r REG 8,1 104865470 3907 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part009.log.gz (deleted)
java 28456 logstash 164r REG 8,1 104858908 22297 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part019.log.gz (deleted)
java 28456 logstash 165r REG 8,1 104860501 8871 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part015.log.gz (deleted)
java 28456 logstash 166r REG 8,1 104861010 22365 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part024.log.gz (deleted)
java 28456 logstash 167r REG 8,1 104860015 12764 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part021.log.gz (deleted)
java 28456 logstash 168r REG 8,1 104859781 11550 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part018.log.gz (deleted)
java 28456 logstash 170r REG 8,1 104861647 13002 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part020.log.gz (deleted)
java 28456 logstash 172r REG 8,1 104860524 22301 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part022.log.gz (deleted)
java 28456 logstash 173r REG 8,1 104857893 5064 /tmp/logstash_storage/logstash_gcs_2018-02-08_logstash_shipper_hostname.part023.log.gz (deleted)
$
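
Each entry above is a read descriptor (the "r" in the FD column) on a file of roughly 100 MiB, which lines up with max_file_size_kbytes => 102400. The same effect can be reproduced outside the plugin. The sketch below assumes a Linux or other POSIX filesystem and uses a made-up file name; it does not use any plugin code.

```ruby
require "tmpdir"
require "fileutils"

dir  = Dir.mktmpdir
path = File.join(dir, "example.part000.log.gz") # hypothetical file name
File.write(path, "x" * 1024)

# Open a read handle and "forget" to close it, like the leaked descriptors in lsof.
reader = File.open(path, "rb")

# Remove the directory entry, as happens once a part file has been uploaded.
File.delete(path)

listed               = File.exist?(path)     # the name is gone from the directory
bytes_still_readable = reader.read.bytesize  # the data is still reachable via the handle

puts "listed in directory: #{listed}"
puts "bytes still readable: #{bytes_still_readable}"

# Only after the last open handle is closed can the filesystem reclaim the space.
reader.close
FileUtils.remove_entry(dir)
```

Until the handle is closed, df keeps counting the blocks as used even though ls no longer shows the file.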

josephlewis42 commented 6 years ago

I'm pretty sure this is caused by some of the same race conditions that led to fd leaks in logstash-output-google_bigquery.

This should be fixable by reworking the plugin to use a single queue of files that need to be uploaded, and adding a file to that queue only if it's actually ready, i.e. the current file shouldn't be in there like it is today. The worker pool (#5) should also help.
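
A rough sketch of that idea in plain Ruby, not the plugin's actual code: the upload method, the file names, and the worker count are all hypothetical, and the "upload" just reads the file. The point is the ordering: a part file is closed before it is enqueued, and every worker opens files in block form so the descriptor is released before the file is deleted.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for the real upload call.
def upload(path)
  # Block form guarantees the descriptor is closed when the block exits.
  File.open(path, "rb") { |f| f.read.bytesize }
end

upload_queue = Queue.new # the single queue of files that are ready
uploaded     = Queue.new # thread-safe record of finished uploads

# Small worker pool: each worker takes a ready file, uploads it, then deletes it.
workers = Array.new(2) do
  Thread.new do
    while (path = upload_queue.pop) != :stop
      upload(path)
      File.delete(path)
      uploaded << File.basename(path)
    end
  end
end

dir = Dir.mktmpdir
3.times do |i|
  path = File.join(dir, format("example.part%03d.log", i))
  # The writer finishes and closes the part file first...
  File.open(path, "w") { |f| f.write("event\n") }
  # ...and only then hands it to the uploaders. The file currently being
  # written never enters the queue.
  upload_queue << path
end

# One stop marker per worker, then wait for the pool to drain.
workers.size.times { upload_queue << :stop }
workers.each(&:join)

uploaded_count = uploaded.size
leftover       = Dir.entries(dir) - [".", ".."]
puts "uploaded #{uploaded_count} files, #{leftover.size} left on disk"
FileUtils.remove_entry(dir)
```

Because each path is popped by exactly one worker and never reopened afterwards, no handle should outlive the delete in this arrangement.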