Podcastindex-org / podping-hivewriter

The hive writer component of podping.
MIT License
15 stars 5 forks source link

Non standard RPC error response from Hive RPC node #51

Closed brianoflondon closed 2 years ago

brianoflondon commented 2 years ago

This error... which I think stems from a non standard error response from one of the Hive RPC nodes (possibly rpc.ausbit.dev) I think can be fixed.

2022-08-06T02:10:53+0000 | INFO | Lighthive Fastest: https://rpc.ausbit.dev
2022-08-06T02:10:53+0000 | INFO | Status - Uptime: 13:21:33 | IRIs Received: 123 | IRIs Deduped: 123 | IRIs Sent: 123 | last_node: https://rpc.ausbit.dev
2022-08-06T02:12:39+0000 | INFO | IRI batch_id a6481954-6d91-46db-a40a-7f80333aea8e - Size of IRIs: 43
2022-08-06T02:12:39+0000 | INFO | Received 1 IRIs
2022-08-06T02:12:39+0000 | ERROR | send_notification error: Internal Error
2022-08-06T02:12:39+0000 | ERROR | Failed to send 1 IRIs
Traceback (most recent call last):
  File "/home/podping/app/src/podping_hivewriter/cli/podping.py", line 245, in server
    loop = asyncio.get_running_loop()
  File "/opt/pypy/lib/pypy3.8/asyncio/events.py", line 685, in get_running_loop
    raise RuntimeError('no running event loop')
RuntimeError: no running event loop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 484, in failure_retry
    reason=reason or self.reason,
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 455, in send_notification_iris
    await self.send_notification(payload.dict(), hive_operation_id)
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 417, in send_notification
    raise ex
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 400, in send_notification
    await self._async_hive_broadcast(op=op, dry_run=self.dry_run)
  File "/home/podping/app/src/podping_hivewriter/async_wrapper.py", line 32, in wrapper
    return await async_fn(*args, **kwargs)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/asgiref/sync.py", line 435, in __call__
    ret = await asyncio.wait_for(future, timeout=None)
  File "/opt/pypy/lib/pypy3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/opt/pypy/lib/pypy3.8/asyncio/futures.py", line 260, in __await__
    yield self  # This tells Task to wait for completion.
  File "/opt/pypy/lib/pypy3.8/asyncio/tasks.py", line 349, in __wakeup
    future.result()
  File "/opt/pypy/lib/pypy3.8/asyncio/futures.py", line 178, in result
    raise self._exception
  File "/opt/pypy/lib/pypy3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/pypy/lib/pypy3.8/_functools.py", line 80, in __call__
    return self._func(*(self._args + fargs), **fkeywords)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/asgiref/sync.py", line 474, in thread_handler
    return func(*args, **kwargs)
  File "/opt/pypy/lib/pypy3.8/_contextvars.py", line 43, in run
    return callable(*args, **kwargs)
  File "/opt/pypy/lib/pypy3.8/_functools.py", line 80, in __call__
    return self._func(*(self._args + fargs), **fkeywords)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 185, in broadcast
    op, chain=self.chain, dry_run=dry_run)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/broadcast/transaction_builder.py", line 126, in broadcast
    self.prepare()
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/broadcast/transaction_builder.py", line 36, in prepare
    {"block_num": properties["head_block_number"] - 2})["block"]
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 53, in callable
    return self.request(attr, *args, **kwargs)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 152, in request
    self.validate_response(response)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 167, in validate_response
    raw_body=response,
lighthive.exceptions.RPCNodeException: Internal Error
2022-08-06T02:12:39+0000 | INFO | Current node: https://rpc.ausbit.dev
2022-08-06T02:12:39+0000 | INFO | ['https://api.hive.blog', 'https://hived.emre.sh', 'https://api.deathwing.me', 'https://rpc.ausbit.dev', 'https://api.pharesim.me', 'https://rpc.ecency.com', 'https://hive-api.arcange.eu', 'https://api.openhive.network', 'https://api.hive.blue', 'https://techcoderx.com']
2022-08-06T02:12:39+0000 | ERROR | Unexpected condition in error text from Hive
Traceback (most recent call last):
  File "/home/podping/app/src/podping_hivewriter/cli/podping.py", line 245, in server
    loop = asyncio.get_running_loop()
  File "/opt/pypy/lib/pypy3.8/asyncio/events.py", line 685, in get_running_loop
    raise RuntimeError('no running event loop')
RuntimeError: no running event loop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 484, in failure_retry
    reason=reason or self.reason,
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 455, in send_notification_iris
    await self.send_notification(payload.dict(), hive_operation_id)
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 417, in send_notification
    raise ex
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 400, in send_notification
    await self._async_hive_broadcast(op=op, dry_run=self.dry_run)
  File "/home/podping/app/src/podping_hivewriter/async_wrapper.py", line 32, in wrapper
    return await async_fn(*args, **kwargs)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/asgiref/sync.py", line 435, in __call__
    ret = await asyncio.wait_for(future, timeout=None)
  File "/opt/pypy/lib/pypy3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/opt/pypy/lib/pypy3.8/asyncio/futures.py", line 260, in __await__
    yield self  # This tells Task to wait for completion.
  File "/opt/pypy/lib/pypy3.8/asyncio/tasks.py", line 349, in __wakeup
    future.result()
  File "/opt/pypy/lib/pypy3.8/asyncio/futures.py", line 178, in result
    raise self._exception
  File "/opt/pypy/lib/pypy3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/pypy/lib/pypy3.8/_functools.py", line 80, in __call__
    return self._func(*(self._args + fargs), **fkeywords)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/asgiref/sync.py", line 474, in thread_handler
    return func(*args, **kwargs)
  File "/opt/pypy/lib/pypy3.8/_contextvars.py", line 43, in run
    return callable(*args, **kwargs)
  File "/opt/pypy/lib/pypy3.8/_functools.py", line 80, in __call__
    return self._func(*(self._args + fargs), **fkeywords)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 185, in broadcast
    op, chain=self.chain, dry_run=dry_run)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/broadcast/transaction_builder.py", line 126, in broadcast
    self.prepare()
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/broadcast/transaction_builder.py", line 36, in prepare
    {"block_num": properties["head_block_number"] - 2})["block"]
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 53, in callable
    return self.request(attr, *args, **kwargs)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 152, in request
    self.validate_response(response)
  File "/home/podping/app/.venv/lib/pypy3.8/site-packages/lighthive/client.py", line 167, in validate_response
    raw_body=response,
lighthive.exceptions.RPCNodeException: Internal Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/podping/app/src/podping_hivewriter/podping_hivewriter.py", line 495, in failure_retry
    ex.raw_body["error"]["data"]["name"]
KeyError: 'name'
brianoflondon commented 2 years ago

I don't think the case which is tested by test_#48 is covering this retry problem. It also causes the writer to dump out and potentially loose one or more of the IRIs which have been sent to write out.

I can see this with my 3pseak system because I have a backup server which watches to see if any given podping actually makes it to the chain. When one or more fail to be written, the backup server acts. That's what happend with this crash I believe.

brianoflondon commented 2 years ago

I don't think the case which is tested by test_#48 is covering this retry problem. It also causes the writer to dump out and potentially loose one or more of the IRIs which have been sent to write out.

I can see this with my 3pseak system because I have a backup server which watches to see if any given podping actually makes it to the chain. When one or more fail to be written, the backup server acts. That's what happend with this crash I believe.

brianoflondon commented 2 years ago

I don't think the case which is tested by test_#48 is covering this retry problem. It also causes the writer to dump out and potentially loose one or more of the IRIs which have been sent to write out.

I can see this with my 3pseak system because I have a backup server which watches to see if any given podping actually makes it to the chain. When one or more fail to be written, the backup server acts. That's what happend with this crash I believe.

daveajones commented 2 years ago

I didn’t realize an IRI was being lost here when this happens. Good catch Brian.

brianoflondon commented 2 years ago

I've built a version of this including my fix and I'm running it on my two 3speak systems so I'll watch it for a few days and see if this crops up gain.

agates commented 2 years ago

Can you please test the version in this branch? I made a slight change.

https://github.com/Podcastindex-org/podping-hivewriter/tree/hotfix/1.2.5

If I can't catch the memory leak tonight, I'll publish this version as-is.

agates commented 2 years ago

Also for what it's worth, I knew this was an issue when updating the hive-watcher for @daveajones, but didn't realize we were handling it differently.

brianoflondon commented 2 years ago

I'm now running a rebuilt docker from the /hotfix/1.2.5-alpha.1 branch which I made. Git gave me all sorts of Detached HEAD complaints with your hotfix/1.2.5 tag.

The only errors I consistently see are these:

podping-3spk         | 2022-08-07T02:08:10+0000 | INFO | Lighthive Fastest: https://hive-api.arcange.eu
podping-3spk         | 2022-08-07T02:08:10+0000 | INFO | Status - Uptime: 15:49:29 | IRIs Received: 123 | IRIs Deduped: 123 | IRIs Sent: 123 | last_node: https://hive-api.arcange.eu
podping-3spk         | Exception ignored in: weakref callback  <function WeakKeyDictionary.__init__.<locals>.remove at 0x000000000523a520>
podping-3spk         | Traceback (most recent call last):
podping-3spk         |   File "/opt/pypy/lib/pypy3.8/weakref.py", line 377, in remove
podping-3spk         |     del self.data[k]
podping-3spk         | KeyError: <weakref at 0x0000000001bc1cc0; dead>
podping-3spk         | Exception ignored in: weakref callback  <function WeakKeyDictionary.__init__.<locals>.remove at 0x000000000523a520>
podping-3spk         | Traceback (most recent call last):
podping-3spk         |   File "/opt/pypy/lib/pypy3.8/weakref.py", line 377, in remove
podping-3spk         |     del self.data[k]
podping-3spk         | KeyError: <weakref at 0x0000000002272c00; dead>
podping-3spk         | Exception ignored in: weakref callback  <function WeakKeyDictionary.__init__.<locals>.remove at 0x000000000523a520>
podping-3spk         | Traceback (most recent call last):
podping-3spk         |   File "/opt/pypy/lib/pypy3.8/weakref.py", line 377, in remove
podping-3spk         |     del self.data[k]
podping-3spk         | KeyError: <weakref at 0x0000000002272c40; dead>
podping-3spk         | 2022-08-07T02:11:11+0000 | INFO | Lighthive Fastest: https://api.deathwing.me
agates commented 2 years ago

Git gave me all sorts of Detached HEAD complaints with your hotfix/1.2.5 tag.

All that means is you didn't checkout the branch correctly.

brianoflondon commented 2 years ago

Git gave me all sorts of Detached HEAD complaints with your hotfix/1.2.5 tag.

All that means is you didn't checkout the branch correctly.

I have a PhD in not doing things properly with git.

brianoflondon commented 2 years ago

This is what I get when I try to checkout your branch

image
agates commented 2 years ago

Closed with v1.2.5