Netflix / suro

Netflix's distributed Data Pipeline
Apache License 2.0
794 stars 170 forks

Added support for an AvroFileWriter #252

Open Crystark opened 9 years ago

Crystark commented 9 years ago

Hi,

I wrote this some time ago and thought you might be interested in having it in suro's master branch. It's a basic Avro file writer that you can select when configuring a sink. The schema must be provided as part of the configuration, as an escaped JSON string.

For instance:

    "item-local-sink": {
        "type": "local",
        "maxFileSize": "1048576000",
        "rotationPeriod": "PT1m",
        "outputDir": "/data/surodata/local/item",
        "writer": {
            "type": "avro",
            "schema": "{\"type\":\"record\",\"name\":\"Item\",\"namespace\":\"my.app.namespace\",\"fields\":[{\"name\":\"timestamp\",\"type\":\"long\"},{\"name\":\"name\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}},{\"name\":\"description\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}},{\"name\":\"option\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}],\"default\":null},{\"name\":\"type\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}],\"default\":null},{\"name\":\"price\",\"type\":[\"null\",\"float\"],\"default\":null}]}"
        }
    }

This is pretty basic but it's been really useful in our case.
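
Since the schema has to be embedded as an escaped JSON string, writing it by hand is error-prone. As a rough sketch (not part of this pull request), the sink entry could be generated from a readable schema with Python's standard `json` module. The names `ITEM_SCHEMA`, `build_local_sink` and `sink_config` are hypothetical and exist only for this illustration; the keys and values are copied from the example above.

```python
import json

# Readable form of the Avro schema from the example above.
# ITEM_SCHEMA is a hypothetical name used only in this sketch.
STRING_TYPE = {"type": "string", "avro.java.string": "String"}
ITEM_SCHEMA = {
    "type": "record",
    "name": "Item",
    "namespace": "my.app.namespace",
    "fields": [
        {"name": "timestamp", "type": "long"},
        {"name": "name", "type": STRING_TYPE},
        {"name": "description", "type": STRING_TYPE},
        {"name": "option", "type": ["null", STRING_TYPE], "default": None},
        {"name": "type", "type": ["null", STRING_TYPE], "default": None},
        {"name": "price", "type": ["null", "float"], "default": None},
    ],
}


def build_local_sink(schema, output_dir):
    """Return a local sink entry whose Avro writer carries the schema as a string."""
    return {
        "type": "local",
        "maxFileSize": "1048576000",
        "rotationPeriod": "PT1m",
        "outputDir": output_dir,
        "writer": {
            "type": "avro",
            # Compact separators keep the embedded schema on a single line.
            "schema": json.dumps(schema, separators=(",", ":")),
        },
    }


sink_config = {
    "item-local-sink": build_local_sink(ITEM_SCHEMA, "/data/surodata/local/item")
}

# Serializing the outer document escapes the inner quotes automatically.
print(json.dumps(sink_config, indent=4))
```

Serializing twice (once for the schema, once for the surrounding configuration) is what produces the `\"` escapes, so the schema string can never end up split or half-escaped.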

cloudbees-pull-request-builder commented 9 years ago

NetflixOSS » suro » suro-pull-requests #71 FAILURE Looks like there's a problem with this pull request

cloudbees-pull-request-builder commented 9 years ago

suro-pull-requests #239 FAILURE Looks like there's a problem with this pull request