Closed: xiaoleihuang closed this issue 8 years ago
I couldn't find the Hadoop dependency in your Dockerfile, even though you're using org.apache.hadoop.fs.FileSystem to load your config file and for checkpointing.
Also, it would be better to load your config file with local FS calls and to do checkpointing with the Hadoop FS. And what's your checkpointing interval?
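The split suggested above (local FS for the config file, Hadoop FS only for checkpoints) could look like this minimal Java sketch. The file name and property key are illustrative, and the Hadoop-FS side is shown only in a comment since it needs a running HDFS:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class ConfigLoaderSketch {

    // Load the application config from the local filesystem,
    // instead of going through org.apache.hadoop.fs.FileSystem.
    static Properties loadLocalConfig(Path path) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(path)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative config file; in the real project this would be
        // the existing properties file shipped with the image.
        Path cfg = Files.createTempFile("app", ".properties");
        Files.write(cfg, "checkpoint.dir=hdfs:///checkpoints\n"
                .getBytes(StandardCharsets.UTF_8));

        Properties props = loadLocalConfig(cfg);
        System.out.println(props.getProperty("checkpoint.dir"));

        // Checkpointing, by contrast, stays on the Hadoop FS, e.g. with
        // Spark Streaming:
        //   jssc.checkpoint(props.getProperty("checkpoint.dir"));
    }
}
```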
Hi Sameera (@SamTube405), the Hadoop dependency is configured in the base Docker image from sequenceiq; see line 1 of its Dockerfile: `FROM sequenceiq/hadoop-docker:2.6.0`. I build on their Spark image, so Hadoop comes predefined. Should I explicitly switch to the local FS method calls? I currently use the method described here. For the interval, I use the default setting: 10 seconds, or the batch processing interval if that interval is larger than 10s.
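The default described above can be sketched as a small calculation: as I read Spark's DStream defaults, the checkpoint interval is the smallest multiple of the batch (slide) interval that is at least 10 seconds. The class and method names below are illustrative, not Spark API:

```java
// Sketch of Spark Streaming's default DStream checkpoint interval:
// the smallest multiple of the batch interval that is >= 10 seconds.
public class CheckpointIntervalSketch {

    static long defaultCheckpointIntervalMs(long batchIntervalMs) {
        long floorMs = 10_000L; // Spark's 10-second floor
        long multiples = (long) Math.ceil((double) floorMs / batchIntervalMs);
        return multiples * batchIntervalMs;
    }

    public static void main(String[] args) {
        // 2s batches -> checkpoint every 10s (5 batches)
        System.out.println(defaultCheckpointIntervalMs(2_000));
        // 15s batches -> checkpoint every batch, i.e. every 15s
        System.out.println(defaultCheckpointIntervalMs(15_000));
    }
}
```

So with the default batch interval, checkpointing happens every 10 seconds; with longer batches it happens once per batch.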
Took advantage of sequenceiq/spark to dockerize the whole project, with some modifications: use Java 8; set the encoding to UTF-8.
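A minimal sketch of what those modifications could look like in the Dockerfile; the base-image tag, package name, and JAVA_HOME path are assumptions, not taken from the actual project:

```dockerfile
# Build on the sequenceiq Spark image, which already bundles Hadoop.
# The tag is illustrative.
FROM sequenceiq/spark:1.6.0

# Switch to Java 8; the exact package name and install command are
# assumptions about the base image's package manager.
RUN yum install -y java-1.8.0-openjdk-devel
ENV JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk

# Set the default encoding to UTF-8 for the shell and the JVM.
ENV LANG=en_US.UTF-8 \
    LC_ALL=en_US.UTF-8 \
    JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8
```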
Finished some of the how-to content in Readme.md.