qubole / streamx

kafka-connect-s3 : Ingest data from Kafka to Object Stores (S3)
Apache License 2.0

Do I have to set up HDFS in order to use streamX? #60

Open iShiBin opened 6 years ago

iShiBin commented 6 years ago

I noticed that I have to set up Hadoop config files like core-site.xml and hdfs-site.xml in order to configure S3, and I could not find the mentioned config/hadoop-conf directory in my installation (Kafka 0.10.2.0). So do I have to use HDFS in order to use streamX?

What I am trying to do is transform messages in JSON format to Parquet and then store them in S3.

Spark could achieve this, but it would require a long-running cluster; alternatively, I could use checkpointing to do a basic once-per-day ETL.

OneCricketeer commented 5 years ago

And I could not find the mentioned config/hadoop-conf directory in my installation (Kafka 0.10.2.0).

Kafka is not a Hadoop project, which is why you will not find that folder there. You have to create it yourself. An EMR instance, or another EC2 machine provisioned with Hadoop, would already have this folder.
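
For illustration, a minimal core-site.xml placed in a hand-made hadoop-conf directory might look like the sketch below. The fs.s3a.* keys assume the s3a filesystem connector; depending on your streamx and Hadoop versions you may need the older fs.s3 or fs.s3n properties instead.

```xml
<!-- hadoop-conf/core-site.xml: a minimal sketch, assuming the s3a connector -->
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_AWS_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_AWS_SECRET_KEY</value>
  </property>
</configuration>
```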

So do I have to use HDFS in order to use this streamX?

Not exactly, but you do need a Hadoop-compatible filesystem (which S3 is).

Since this project uses the Hadoop FileSystem API, you just need to point it at a configuration directory containing those XML files.
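
As a hedged sketch (the connector class and the s3.url / hadoop.conf.dir property names here are assumptions you should verify against your streamx version), a sink configuration pointing at that directory could look like:

```properties
# s3-sink.properties: illustrative only; verify property names for your version
name=s3-sink
connector.class=com.qubole.streamx.s3.S3SinkConnector
tasks.max=1
topics=my-topic                   # hypothetical topic name
flush.size=1000
s3.url=s3://my-bucket/streamx     # hypothetical bucket and prefix
hadoop.conf.dir=/etc/hadoop-conf  # directory holding core-site.xml etc.
```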

Spark could achieve this, but it would require a long-running cluster

Kafka Connect consumers are also typically long-running, as part of a cluster / consumer group.
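
For example, Connect workers are normally started as a long-lived distributed-mode service using the scripts shipped with Kafka, and connector configs are then submitted over the worker's REST API (the JSON file name below is hypothetical):

```sh
# Start a long-running Connect worker in distributed mode (stock Kafka script)
bin/connect-distributed.sh config/connect-distributed.properties

# Submit the S3 sink connector to the worker's REST API (default port 8083)
curl -X POST -H "Content-Type: application/json" \
     --data @s3-sink.json http://localhost:8083/connectors
```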