Closed bilalshaikh42 closed 2 years ago

Hello, I have seen the following errors pop up in our error reporting. They all seem to be related, and I am trying to figure out what could be causing them. I suspect a connection to the bucket failed, but rather than the failure being handled gracefully, one of the pods may have crashed. This then causes the lockup issue described in #104. I am not sure this is the actual sequence of events, but it seems likely based on observing the requests (most recent error last).
I've never seen anything like this, but I mostly use Amazon's S3, which rarely seems to have issues.
Do these errors correlate with a particular type of usage? E.g. only under heavy load?
I put in a fix for datanode.py line 163 - that call was just missing an argument for the logger.
Yes, this was under high load, so it's possible that the host's networking was saturated and unable to reach the bucket. There may be some missing catch statements that would prevent this from bubbling up to the top and crashing the application. I'll see if I can find where those might be needed.
Yes, likely there needs to be more robust handling for reads/writes to the storage system.
For read errors, we can just return a 500 to the client and have the client retry the request (ideally with some sort of exponential backoff). Errors during the bucketScan operation are non-critical, and the server can simply retry the scan after a bit. Errors during write operations need to be caught and retried by the s3sync task - it's critical to retain the data in memory until it has been successfully written.
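As a rough illustration, the client-side retry could look something like this (a minimal sketch; `get_with_retry` and its parameters are hypothetical, not an existing HSDS client API):

```python
import random
import time

import requests

def get_with_retry(url, max_retries=5, base_delay=0.5):
    """GET with exponential backoff on 5xx responses or connection errors."""
    for attempt in range(max_retries):
        try:
            rsp = requests.get(url, timeout=30)
            if rsp.status_code < 500:
                return rsp  # success, or a client error that retrying won't fix
        except requests.ConnectionError:
            pass  # transient network failure: fall through and retry
        # back off exponentially (0.5s, 1s, 2s, ...) with a little jitter
        time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
    raise RuntimeError(f"giving up on {url} after {max_retries} attempts")
```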
I'll look into adding some test scaffolding that randomly causes storage reads/writes to fail - in the style of Netflix's Chaos Monkey: https://github.com/Netflix/chaosmonkey. That will make it easier to reproduce these types of scenarios.
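Something along these lines, where a wrapper randomly injects failures around the storage calls (a sketch; the class and method names are illustrative, not HSDS's actual storage layer):

```python
import random

class FlakyStorage:
    """Wraps a storage client and randomly fails reads/writes."""

    def __init__(self, client, fail_rate=0.05):
        self._client = client
        self._fail_rate = fail_rate

    def _maybe_fail(self, op):
        # fail roughly fail_rate of the time to simulate a flaky bucket
        if random.random() < self._fail_rate:
            raise IOError(f"chaos: injected {op} failure")

    def read(self, key):
        self._maybe_fail("read")
        return self._client.read(key)

    def write(self, key, data):
        self._maybe_fail("write")
        return self._client.write(key, data)
```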
Hey @bilalshaikh42, I've checked in some changes to master that should make things more stable under high load. I'm not sure whether it will help your specific setup, but I'd be interested to hear any feedback if you can give it a try.
To simulate high-load workflows, I created the test hsds/tests/perf/write, which runs a set of pods all writing to one dataset.
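From the client side, the load pattern is roughly a set of concurrent writers each updating a slice of the same dataset. Here is a minimal sketch using h5pyd (the endpoint, domain path, and sizes are placeholders, not the actual test's values):

```python
import multiprocessing

import numpy as np
import h5pyd  # HSDS client with an h5py-compatible API

ENDPOINT = "http://hsds.local"        # placeholder endpoint
FILEPATH = "/home/test_user/perf.h5"  # placeholder domain path
NUM_WRITERS = 4
ROWS_PER_WRITER = 1000

def writer(rank):
    """Each 'pod' writes its own slice of the shared dataset."""
    f = h5pyd.File(FILEPATH, "a", endpoint=ENDPOINT)
    dset = f["data"]
    start = rank * ROWS_PER_WRITER
    dset[start:start + ROWS_PER_WRITER] = np.random.rand(ROWS_PER_WRITER)
    f.close()

if __name__ == "__main__":
    # create the shared dataset once
    with h5pyd.File(FILEPATH, "w", endpoint=ENDPOINT) as f:
        f.create_dataset("data", (NUM_WRITERS * ROWS_PER_WRITER,), dtype="f8")
    procs = [multiprocessing.Process(target=writer, args=(i,))
             for i in range(NUM_WRITERS)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```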
Sure, I can test this out and see if it helps! Is there a docker image with the changes available?
I have been able to resolve our issues for now by simply disabling the metadata cache and chunk cache (setting their sizes to 0). This works for us since we only have heavy write loads and light read loads, but I presume it is not an ideal solution. It might give some hint as to where the issue lies, however.
I've pushed the latest image to docker hub as: hdfgroup/hsds:ad8597f
Strange to hear that disabling the caching helped. I would have thought that wouldn't work - and when I tried setting the cache configs to 0, I do get an exception (a divide-by-zero error)!
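One way to avoid that exception would be to treat a size of 0 as "cache disabled" and bypass the cache entirely, e.g. (a sketch; these names are illustrative, not the actual HSDS internals):

```python
def get_chunk(chunk_id, cache, fetch_from_s3):
    """Read-through lookup that tolerates a disabled (size-0) cache.

    `cache` is None when the configured cache size is 0, so no code
    path ever divides by the cache size.
    """
    if cache is not None and chunk_id in cache:
        return cache[chunk_id]       # cache hit
    data = fetch_from_s3(chunk_id)   # cache miss, or caching disabled
    if cache is not None:
        cache[chunk_id] = data
    return data
```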
Anyway, try this image with and without the cache settings. For reading, I do see quite a bit of speedup when an item is in the cache - generally about 2x compared to when it has to be retrieved from S3.
This latest image, hdfgroup/hsds:36a7c61, might be even better!
This seems to be fully resolved!
great - I'll close the issue then.