Open iaroslav-ai opened 8 years ago
I was trying to use webhdfs too and ran into a problem. In my case, webhdfs redirects to a data node every time I try to write to a file, and the redirect URL uses the internal hostname of the Docker container (something like a65ec753065c). Any ideas about this?
The following is an example request:
curl -i -X PUT -T ~/Downloads/JEA_BLOWER_DEFINITION.csv "http://localhost:50070/webhdfs/v1/user/root/f.txt?op=CREATE&user.name=root&overwrite=true"
HTTP/1.1 100 Continue
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Wed, 19 Oct 2016 03:45:02 GMT
Date: Wed, 19 Oct 2016 03:45:02 GMT
Pragma: no-cache
Expires: Wed, 19 Oct 2016 03:45:02 GMT
Date: Wed, 19 Oct 2016 03:45:02 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1476884702571&s=n+WgHqacT3Q5OthGXHXPBtD2YlQ="; Path=/; Expires=Wed, 19-Oct-2016 13:45:02 GMT; HttpOnly
Location: http://a65ec753065c:50075/webhdfs/v1/user/root/f.txt?op=CREATE&user.name=root&namenoderpcaddress=a65ec753065c:9000&overwrite=true
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26)
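One way to work around the transcript above is to intercept the 307 and rewrite the container-internal hostname in the Location header before making the second PUT. A minimal sketch in Python; the hostname localhost and the mapped port 50075 are assumptions about your particular Docker setup:

```python
from urllib.parse import urlsplit, urlunsplit

def rewrite_datanode_location(location, host="localhost"):
    """Replace the (container-internal) hostname in a WebHDFS redirect
    URL with one reachable from the host machine, keeping the port,
    path and query string intact."""
    parts = urlsplit(location)
    port = parts.port or 50075  # default WebHDFS datanode HTTP port
    return urlunsplit(parts._replace(netloc=f"{host}:{port}"))

loc = ("http://a65ec753065c:50075/webhdfs/v1/user/root/f.txt"
       "?op=CREATE&user.name=root&namenoderpcaddress=a65ec753065c:9000&overwrite=true")
print(rewrite_datanode_location(loc))

# Sketch of the full two-step flow with requests (not run here):
#   resp = requests.put(create_url, allow_redirects=False)
#   requests.put(rewrite_datanode_location(resp.headers["Location"]), data=payload)
```

This only helps if 50075 is actually published from the container to the host; otherwise the rewritten URL is just as unreachable as the original.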
I am having the same issue as above; will it be addressed soon?
I also have the same problem and have already tried all the Python libraries available. Did anyone solve this with a magical workaround?
Not sure how this would translate to docker-compose, but I can get this to work using:
docker run -h localhost -p 50070:50070 -p 50075:50075 <<Container_Name>>
@PhilipMourdjis if you're using docker-compose, you can set the hostname to localhost like this:

hadoop:
  image: <image_name>
  hostname: localhost
  ports:
    - "50070:50070"
    - "50075:50075"
Just follow the Location header in the redirect response.
Note, from the WebHDFS docs: "Step 2: Submit another HTTP PUT request using the URL in the Location header (or the returned response in case you specified noredirect) with the file data to be written." FYI Link
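The noredirect variant mentioned in that quote avoids the 307 entirely: with noredirect=true, the namenode (in newer Hadoop releases) returns the datanode URL in a JSON body instead of a Location header. A sketch of handling that response; the sample body below is hypothetical, modelled on the shape documented for WebHDFS:

```python
import json

# With op=CREATE&noredirect=true, the namenode answers with a JSON body
# of the form {"Location": "<datanode URL>"} instead of a 307 redirect.
sample_body = ('{"Location": "http://a65ec753065c:50075/webhdfs/v1/'
               'user/root/f.txt?op=CREATE&overwrite=true"}')

def datanode_url(body):
    """Extract the datanode URL from a noredirect CREATE response."""
    return json.loads(body)["Location"]

print(datanode_url(sample_body))
```

Note that the second PUT (with the file data) still goes to this URL, so the container-internal hostname problem remains unless the hostname is fixed as in the docker run / docker-compose workarounds above.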
So far I have not been able to use webhdfs with the Docker version of Hadoop (on Ubuntu). Here is what I tried:
1) Add a text file at user/root/f.txt:
2) Try reading the contents of the file from hdfs:
For which I get
I tried three different Python libraries for webhdfs, but none of them work either. All of them stop with a message similar to the above
when trying to create a file or folder. I also tried rebuilding the Docker image to account for port 9000 not being exposed, but that did not seem to help. Am I doing something utterly wrong? I expect this is likely, given that I am a total had00p n00b :)