big-data-europe / docker-hadoop

Apache Hadoop docker image

Unable to establish connection to datanode when using pyhdfs to access HDFS #124

Closed DAYceng closed 2 years ago

DAYceng commented 2 years ago

I modified docker-compose.yml as suggested in issue #98 and built a Hadoop cluster, and I can use Python to do CRUD operations on HDFS from my server. The test code is as follows:

import pyhdfs
fs = pyhdfs.HdfsClient(hosts="xx.xx.48.xx:9870", user_name="root")
userhomedir = fs.get_home_directory()
print(userhomedir)
availablenode = fs.get_active_namenode()
print(availablenode)
fs.mkdirs("/data")
print(fs.listdir("/"))
fs.copy_from_local("/root/docker-hadoop/test.log", "/data/test.log", overwrite=True)  
fs.copy_to_local("/data/test.log", 'test2.log')

However, running the same code in a Jupyter Notebook container built on the same server gives an error. I also used a docker-compose.yml to build the Jupyter container, so Jupyter and Hadoop are on two different Docker networks (I don't know whether this is the cause of the error). The specific error is as follows:

---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/urllib3/connection.py in _new_conn(self)
    159             conn = connection.create_connection(
--> 160                 (self._dns_host, self.port), self.timeout, **extra_kw
    161             )

/opt/conda/lib/python3.7/site-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     83     if err is not None:
---> 84         raise err
     85 

/opt/conda/lib/python3.7/site-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     73                 sock.bind(source_address)
---> 74             sock.connect(sa)
     75             return sock

TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    676                 headers=headers,
--> 677                 chunked=chunked,
    678             )

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    391         else:
--> 392             conn.request(method, url, **httplib_request_kw)
    393 

/opt/conda/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1251         """Send a complete request to the server."""
-> 1252         self._send_request(method, url, body, headers, encode_chunked)
   1253 

/opt/conda/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1297             body = _encode(body, 'body')
-> 1298         self.endheaders(body, encode_chunked=encode_chunked)
   1299 

/opt/conda/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1246             raise CannotSendHeader()
-> 1247         self._send_output(message_body, encode_chunked=encode_chunked)
   1248 

/opt/conda/lib/python3.7/http/client.py in _send_output(self, message_body, encode_chunked)
   1025         del self._buffer[:]
-> 1026         self.send(msg)
   1027 

/opt/conda/lib/python3.7/http/client.py in send(self, data)
    965             if self.auto_open:
--> 966                 self.connect()
    967             else:

/opt/conda/lib/python3.7/site-packages/urllib3/connection.py in connect(self)
    186     def connect(self):
--> 187         conn = self._new_conn()
    188         self._prepare_conn(conn)

/opt/conda/lib/python3.7/site-packages/urllib3/connection.py in _new_conn(self)
    171             raise NewConnectionError(
--> 172                 self, "Failed to establish a new connection: %s" % e
    173             )

NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f5d0b9137d0>: Failed to establish a new connection: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    448                     retries=self.max_retries,
--> 449                     timeout=timeout
    450                 )

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    724             retries = retries.increment(
--> 725                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    726             )

/opt/conda/lib/python3.7/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    438         if new_retry.is_exhausted():
--> 439             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    440 

MaxRetryError: HTTPConnectionPool(host='datanode2', port=9864): Max retries exceeded with url: /webhdfs/v1/data/asn_enrich.csv?op=CREATE&user.name=root&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=true (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5d0b9137d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
<ipython-input-5-8619b090793c> in <module>
      1 # upload from local to hadoop
----> 2 fs.copy_from_local("asn_enrich.csv", "/data/asn_enrich.csv", overwrite=True)
      3 # download from hadoop to local
      4 fs.copy_to_local("/data/asn_enrich.csv", 'asn_enrich2.csv')

/opt/conda/lib/python3.7/site-packages/pyhdfs/__init__.py in copy_from_local(self, localsrc, dest, **kwargs)
    873         """
    874         with open(localsrc, 'rb') as f:
--> 875             self.create(dest, f, **kwargs)
    876 
    877     def copy_to_local(self, src: str, localdest: str, **kwargs: _PossibleArgumentTypes) -> None:

/opt/conda/lib/python3.7/site-packages/pyhdfs/__init__.py in create(self, path, data, **kwargs)
    502         assert not metadata_response.content
    503         data_response = self._requests_session.put(
--> 504             metadata_response.headers['location'], data=data, **self._requests_kwargs)
    505         _check_response(data_response, expected_status=HTTPStatus.CREATED)
    506         assert not data_response.content

/opt/conda/lib/python3.7/site-packages/requests/api.py in put(url, data, **kwargs)
    132     """
    133 
--> 134     return request('put', url, data=data, **kwargs)
    135 
    136 

/opt/conda/lib/python3.7/site-packages/requests/api.py in request(method, url, **kwargs)
     59     # cases, and look like a memory leak in others.
     60     with sessions.Session() as session:
---> 61         return session.request(method=method, url=url, **kwargs)
     62 
     63 

/opt/conda/lib/python3.7/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    528         }
    529         send_kwargs.update(settings)
--> 530         resp = self.send(prep, **send_kwargs)
    531 
    532         return resp

/opt/conda/lib/python3.7/site-packages/requests/sessions.py in send(self, request, **kwargs)
    641 
    642         # Send the request
--> 643         r = adapter.send(request, **kwargs)
    644 
    645         # Total elapsed time of the request (approximately)

/opt/conda/lib/python3.7/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    514                 raise SSLError(e, request=request)
    515 
--> 516             raise ConnectionError(e, request=request)
    517 
    518         except ClosedPoolError as e:

ConnectionError: HTTPConnectionPool(host='datanode2', port=9864): Max retries exceeded with url: /webhdfs/v1/data/asn_enrich.csv?op=CREATE&user.name=root&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=true (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5d0b9137d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))

The Jupyter Notebook container is defined as follows:

jupyter:
    container_name: sj-jupyter
    image: sj-jupyter/all-spark-notebook
    build: jupyter/.
    restart: always
    depends_on:
      - "elasticsearch"
    ports:
      - "8888:8888"
    volumes:
      - ./jupyter/notebooks:/home/jovyan/work
    networks:
      - elastinet

To summarize: I deployed docker-hadoop on the server, and from the server itself I can do CRUD operations on HDFS through pyhdfs. But I cannot upload/download files from other Docker containers on that server (on a different network) or from my local computer.

Judging from the error message, my guess is that reading an HDFS path with pyhdfs only needs to reach the namenode, and the namenode can be accessed normally, so operations such as .get_home_directory() and .mkdirs("/data") do not report errors.

However, actually reading and writing file contents requires access to a datanode. The datanodes cannot be reached from outside the Hadoop network, so other containers and my local computer cannot perform upload/download operations.
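
To make this concrete, here is a minimal sketch of the two-step WebHDFS CREATE that pyhdfs performs internally (the namenode host and file path are taken from the example above; everything else is an assumption about a default docker-hadoop setup):

import requests

namenode = "xx.xx.48.xx:9870"  # reachable, so metadata-only calls succeed

# Step 1: ask the namenode to create the file. The namenode does not take the
# data itself; it replies with a 307 redirect whose Location header points at
# a datanode.
resp = requests.put(
    f"http://{namenode}/webhdfs/v1/data/test.log"
    "?op=CREATE&user.name=root&overwrite=true",
    allow_redirects=False,
)
datanode_url = resp.headers["Location"]
print(datanode_url)  # e.g. http://datanode2:9864/... , a container hostname

# Step 2: upload the bytes to the datanode URL. From a container outside the
# docker-hadoop network (or from a remote machine), "datanode2" cannot be
# resolved or reached, so this request times out exactly as in the traceback.
requests.put(datanode_url, data=b"some data")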

Since the configuration item dfs.client.use.datanode.hostname=true for hdfs-site.xml is already set in the hadoop.env file, I also tried modifying the datanode Dockerfile and adding the following statement: ENV HDFS_CONF_dfs_client_use_datanode_hostname=true, but it didn't work.

So, does anyone have the same problem? Please give me some ideas and help. Thank you very much.

DAYceng commented 2 years ago

I found a solution. Containers that need to interact with Hadoop can be connected to the Docker network where the Hadoop cluster is located:

docker network connect docker-hadoop_default {your_containerName}

Just use the above command, then run docker network inspect docker-hadoop_default to check whether {your_containerName} has been added to the Hadoop network.

done.
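
A quick way to check from inside the notebook container that the change took effect (a small sketch; datanode2 is the datanode hostname from the traceback above):

import socket

# Before joining docker-hadoop_default this raises socket.gaierror; after
# `docker network connect` it prints the datanode's address on that network,
# and the copy_from_local/copy_to_local calls above should then succeed.
print(socket.gethostbyname("datanode2"))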