qubole / streamx

kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
Apache License 2.0

NullPointerException when tasks.max > 1 and using s3a #38

Open levin81 opened 7 years ago

levin81 commented 7 years ago

I am running Kafka Connect in distributed mode on 2 machines (using the Confluent Kafka Connect Docker image). I configured it to work with streamx and created a job with tasks.max=1 while using s3a (configured in hdfs-site.xml). Everything works fine.

Whenever I raise the number of tasks to anything other than 1, I get the following error:

java.lang.NullPointerException
    at io.confluent.connect.hdfs.DataWriter.close(DataWriter.java:299)
    at io.confluent.connect.hdfs.HdfsSinkTask.close(HdfsSinkTask.java:110)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:302)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:435)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:147)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
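
The trace shows the exception surfacing in a close() call that the Connect worker makes while closing partitions. A common way for that kind of failure to occur is a close path that looks up a per-partition writer in a map and calls a method on the result without checking for null. The following is a minimal sketch of that pattern and a defensive variant, not the actual DataWriter code. Every name in it (WriterCloseSketch, PartitionWriter, closeUnguarded, closeGuarded) is hypothetical, and whether line 299 of DataWriter.java fails for this reason is an assumption.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from streamx or kafka-connect-hdfs.
public class WriterCloseSketch {

    /** Stand-in for a per-partition writer. */
    static final class PartitionWriter {
        boolean closed = false;

        void close() {
            closed = true;
        }
    }

    // Writers keyed by a partition label such as "topic-0".
    private final Map<String, PartitionWriter> writers = new HashMap<>();

    void open(String partition) {
        writers.put(partition, new PartitionWriter());
    }

    /** Throws NullPointerException if any assigned partition has no writer. */
    void closeUnguarded(Collection<String> assigned) {
        for (String partition : assigned) {
            writers.get(partition).close();
        }
    }

    /** Skips partitions that have no writer and returns the number closed. */
    int closeGuarded(Collection<String> assigned) {
        int closed = 0;
        for (String partition : assigned) {
            PartitionWriter writer = writers.remove(partition);
            if (writer != null) {
                writer.close();
                closed++;
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        WriterCloseSketch task = new WriterCloseSketch();
        task.open("topic-0"); // "topic-1" is assigned but never opened
        Collection<String> assigned = Arrays.asList("topic-0", "topic-1");

        try {
            task.closeUnguarded(assigned);
        } catch (NullPointerException e) {
            System.out.println("unguarded close failed: NullPointerException");
        }
        System.out.println("guarded close closed " + task.closeGuarded(assigned) + " writer(s)");
    }
}
```

A guard like this only hides the symptom. If writers are missing because initialization failed earlier (for example, while setting up the s3a filesystem), the underlying cause still needs to be found in the task's earlier log output.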

I tried changing the configuration, including the Kafka timeouts, heartbeat settings, etc., to no avail. After reading this issue: https://github.com/qubole/streamx/issues/30, I tried using s3n instead, and it now works without exceptions!

Thanks

PraveenSeluka commented 7 years ago

@levin81 - S3N is stable right now, and I am currently testing S3A. I will have some updates on S3A soon.

PraveenSeluka commented 7 years ago

@levin81 I added a comment to https://github.com/qubole/streamx/issues/30 detailing the s3a issues.