divolte / divolte-collector

Divolte Collector
https://divolte.io/
Apache License 2.0

Setting up S3 buckets as directories in divolte configuration #155

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hi all, I am new to Divolte Collector. I have been experimenting with capturing clickstream data and am trying to store it in predefined S3 buckets. Can anyone provide a sample configuration that uses S3 for the working and publish directories? I tried following the Getting Started guide, but I am still finding it difficult. Below is my configuration.

Code

divolte {
  global {
    hdfs {
      // Enable HDFS sinks.
      enabled = true

      // Use multiple threads to write to HDFS.
      threads = 2
    }
  }

  sinks {
    // The name of the sink. (It's referred to by the mapping.)
    hdfs {
      type = hdfs

      // For HDFS sinks we can control how the files are created.
      file_strategy {
        // Create a new file every minute
        roll_every = 1 minute   // Is this correct?

        // Perform a hsync call on the HDFS files after every 1000 records are written
        // or every 5 seconds, whichever happens first.
        //
        // Performing a hsync call periodically can prevent data loss in the case of
        // some failure scenarios.
        sync_file_after_records = 1000
        sync_file_after_duration = 5 seconds

        // Files that are being written will be created in a working directory.
        // Once a file is closed, Divolte Collector will move the file to the
        // publish directory. The working and publish directories are allowed
        // to be the same, but this is not recommended.
        working_dir = "s3://your-bucket-name"
        publish_dir = "s3://your-bucket-name"  // Is this the right way to define the directories?
      }

      // Set the replication factor for created files.
      replication = 3
    }
  }
}

friso commented 7 years ago

It's probably better to take these kinds of general inquiries to the Google Group / mailing list.

Also be sure to mention your version of Divolte Collector and provide any error messages that you have seen.

ghost commented 7 years ago

Thank you @friso!!