Open tatianafrank opened 5 years ago
Can you be a bit more specific about not working? Do you see any errors?
Here is the error:
```
[main] WARN [NativeCodeLoader]: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
divolte | Exception in thread "main"
divolte | 2019-07-29 15:20:24.908Z [main] ERROR [HdfsFileManager]: Could not initialize HDFS filesystem or failed to check for existence of publish and / or working directories.
divolte | org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "hdfs"
divolte |     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3332)
```
Then I added `fs.file.impl = "org.apache.hadoop.fs.LocalFileSystem"` and `fs.hdfs.impl = "org.apache.hadoop.hdfs.DistributedFileSystem"` to my HDFS configuration in Divolte, and now I'm getting a different error:
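For anyone hitting the same `No FileSystem for scheme "hdfs"` error: a minimal sketch of where those properties can go in `divolte-collector.conf`, assuming the `divolte.global.hdfs.client` section (which Divolte hands through as Hadoop client configuration). The host and port here are placeholders, not values from this thread:

```hocon
divolte.global.hdfs {
  enabled = true
  client {
    // Hypothetical NameNode RPC address; adjust to your cluster.
    fs.defaultFS = "hdfs://namenode:8020"
    // Explicit filesystem implementations, as described above.
    fs.hdfs.impl = "org.apache.hadoop.hdfs.DistributedFileSystem"
    fs.file.impl = "org.apache.hadoop.fs.LocalFileSystem"
  }
}
```

The underlying cause of the original error is usually that the `META-INF/services` entries for `FileSystem` got lost when jars were merged, so spelling the implementations out works around it.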
```
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(Ljava/lang/String;)Ljava/net/InetSocketAddress;
divolte |     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:99)
```
According to this thread (https://stackoverflow.com/questions/45460909/accessing-hdfs-in-java-throws-error) there is an issue with dependency versions in Divolte, but I'm not sure how to change that in Divolte...
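A `NoSuchMethodError` at runtime typically means two different Hadoop releases ended up on the same classpath (e.g. `hadoop-hdfs` from one version, `hadoop-common` from another). One way to diagnose it is to print which jar the JVM actually loaded a class from. This is my own illustration (the class name `WhichJar` and its defaulting behaviour are not part of Divolte); inside the container you would pass it the Hadoop class from the stack trace:

```java
// Sketch: print where the JVM loaded a class from, to spot mixed Hadoop
// jar versions on the classpath. Run with a fully-qualified class name,
// e.g. org.apache.hadoop.hdfs.server.namenode.NameNode; with no argument
// it locates itself as a demo.
public class WhichJar {
    public static void main(String[] args) throws Exception {
        String name = args.length > 0 ? args[0] : WhichJar.class.getName();
        Class<?> c = Class.forName(name);
        java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap classpath have no CodeSource.
        System.out.println(name + " loaded from "
                + (src == null ? "bootstrap classpath" : src.getLocation()));
    }
}
```

If the `NameNode` class and `DistributedFileSystem` report jars from different Hadoop versions, that mismatch is the source of the error.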
This pull request may help: https://github.com/divolte/divolte-collector/pull/244
My Kafka sink is working but my HDFS sink is not. I'm using HDFS 2.0, so that might be why? I've got Divolte running in a Docker container and a Hadoop cluster running in the same docker-compose network, which I set up from https://github.com/big-data-europe/docker-hadoop
Here are the relevant parts of my divolte-collector.conf (some parts stripped for brevity):
For fs.DEFAULT_FS, I've tried hdfs://localhost:9870 and hdfs://namenode:9870 (namenode is the name of the HDFS namenode container running in the same Docker network).
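Two things worth checking here, based on stock Hadoop defaults rather than on your compose file. First, 9870 is the NameNode's HTTP web-UI port in Hadoop 3.x (50070 in 2.x), not the filesystem RPC port; the default filesystem URI should point at the RPC port, commonly 8020 or 9000, matching whatever `CORE_CONF_fs_defaultFS` is set to in the big-data-europe `hadoop.env`. Second, from inside the Divolte container `localhost` refers to that container itself, so the `namenode` service name is the right host. A hypothetical value (port is an assumption; copy it from your `hadoop.env`):

```hocon
// Use the NameNode RPC port, not the 9870 web-UI port.
fs.DEFAULT_FS = "hdfs://namenode:8020"
```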