hurtauda opened this issue 7 years ago
I'm also experiencing this problem, but on the Cloudera platform. We cannot use WebHDFS because it lacks the HA capabilities that HttpFS provides.
Sorry for missing this issue. I'm not familiar with HttpFS, but if WebHDFS and HttpFS are incompatible in several operations, we should handle it.
Is it a bad implementation of HttpFS on the MapR side?
From enarciso's comment, it seems the HttpFS behaviour is the same across several distributions. I'm not sure whether this is a bug in HttpFS or not.
I think the `append` operation should create a new file when the file doesn't exist.
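For reference, the relevant plugin logic looks roughly like this (a paraphrased sketch of the append path, not the exact source):

```ruby
# Sketch of the plugin's append-or-create behavior (paraphrased)
def send_data(path, data)
  if @append
    begin
      @client.append(path, data)
    rescue WebHDFS::FileNotFoundError
      # WebHDFS returns 404 for a missing file, so the plugin creates it.
      # HttpFS returns 500 instead, which surfaces as WebHDFS::ServerError
      # and never reaches this rescue.
      @client.create(path, data)
    end
  else
    @client.create(path, data, 'overwrite' => 'true')
  end
end
```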
My unfortunate workaround at the moment is to constantly monitor the HttpFS logs, watch for a string like the one above, and run a `touchz` to create the file. Thank you for looking into this, @repeatedly.
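Scripted, that workaround could look something like the sketch below, assuming the `webhdfs` gem and placeholder host and path values:

```ruby
# Hypothetical pre-creation script: touch the expected file on HttpFS
# before fluentd tries to append to it (like `hdfs dfs -touchz`).
require 'webhdfs'

client = WebHDFS::Client.new('httpfs.example.com', 14000)  # placeholder host/port
client.httpfs_mode = true

path = Time.now.strftime('/log/access/%Y%m%d.%H.log')      # placeholder path pattern
begin
  client.stat(path)                  # raises if the file is missing
rescue WebHDFS::FileNotFoundError, WebHDFS::ServerError
  client.create(path, '')            # create an empty file, like touchz
end
```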
`WebHDFS::ServerError` means that the client (fluentd) received HTTP response code 500 from the HttpFS server. A WebHDFS server returns 404 in such cases.
IMO it's a bug in the HttpFS implementation, because of the behavior incompatibility between WebHDFS and HttpFS.
And per the documentation, it (HttpFS) is interoperable with the webhdfs REST HTTP API: https://hadoop.apache.org/docs/r2.8.0/hadoop-hdfs-httpfs/index.html
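Until HttpFS is fixed upstream, one conceivable client-side mitigation (not something the plugin does today, as far as I know) would be to rescue `WebHDFS::ServerError` as well and optimistically fall back to `create`:

```ruby
# Hypothetical mitigation sketch: treat HttpFS's 500 on a missing file
# the same way as WebHDFS's 404, and fall back to creating the file.
begin
  client.append(path, data)
rescue WebHDFS::FileNotFoundError
  client.create(path, data)          # WebHDFS: missing file -> HTTP 404
rescue WebHDFS::ServerError
  client.create(path, data)          # HttpFS: missing file -> HTTP 500
end
```

The obvious downside is that this also swallows genuine server-side 500s on the append path, so it would be a stopgap rather than a fix.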
Thank you @tagomoris, I've opened a case with Cloudera.
Hello,
We are running a MapR cluster, and WebHDFS is not supported by MapR, so we are trying to populate Hadoop using HttpFS.
Our WebHDFS config:
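A minimal sketch of the relevant part (host, port, and path are placeholders; `httpfs true` is what points the plugin at the HttpFS gateway on its default port 14000):

```
<match access.**>
  @type webhdfs
  # placeholder HttpFS gateway, not the NameNode
  host httpfs.example.com
  # HttpFS default port
  port 14000
  # use HttpFS mode instead of plain WebHDFS
  httpfs true
  append true
  # placeholder timestamp-based path
  path /log/access/%Y%m%d.%H.log
</match>
```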
However, when using the fluentd plugin, logs are appended correctly to an existing file. But if the file does not exist (we use a timestamp-based filename), we get a `WebHDFS::ServerError` instead of the `WebHDFS::FileNotFoundError` that would, I assume, trigger creation of the file.
Error 500 received by MapR:
Logs from the fluent-plugin-webhdfs plugin:
Related code: https://github.com/fluent/fluent-plugin-webhdfs/blob/master/lib/fluent/plugin/out_webhdfs.rb#L262
What I am not sure about, and cannot find proper HttpFS specifications on the web to answer, is:
Thank you,
Alban