hyperledger / indy-vdr

A library and proxy server for interacting with Hyperledger Indy Node ledger instances
Apache License 2.0

Poisoned lock error appearing recurringly #313

Open esune opened 3 months ago

esune commented 3 months ago

The following error appears sporadically and seems to be related to, or a consequence of, pool timeout errors:

2024-08-02 11:59:47,672 aiohttp.server ERROR Error handling request
Traceback (most recent call last):
  File "/home/indy/.local/lib/python3.7/site-packages/aiohttp/web_protocol.py", line 433, in _handle_request
    resp = await request_handler(request)
  File "/home/indy/.local/lib/python3.7/site-packages/aiohttp/web_app.py", line 504, in _handle
    resp = await handler(request)
  File "/home/indy/tails_server/web.py", line 109, in put_file
    genesis_txn_bytes, revocation_reg_id, storage_path
  File "/home/indy/tails_server/ledger.py", line 29, in get_rev_reg_def
    pool = await indy_vdr.open_pool(transactions_path=tmp_file.name)
  File "/home/indy/.local/lib/python3.7/site-packages/indy_vdr/pool.py", line 175, in open_pool
    pool = Pool(bindings.pool_create(params))
  File "/home/indy/.local/lib/python3.7/site-packages/indy_vdr/bindings.py", line 248, in pool_create
    do_call("indy_vdr_pool_create", params_p, byref(handle))
  File "/home/indy/.local/lib/python3.7/site-packages/indy_vdr/bindings.py", line 153, in do_call
    raise get_current_error(True)
indy_vdr.error.VdrError: Unexpected error: Error acquiring write lock: poisoned lock: another task failed inside

It has surfaced often in https://github.com/bcgov/indy-tails-server after sustained load caused by the ACA-Py integration tests.

ianco commented 2 months ago

The tails server is using vdr version 0.4.2 and the latest is 0.4.3 so possibly a version upgrade will address the issue.

I'm going to try to reproduce the error locally with the tails server on version 0.4.2 and run some load with the ACA-Py integration tests ...

ianco commented 2 months ago

I also suggest setting RUST_BACKTRACE=1 (or RUST_BACKTRACE=full) in the tails server to give us a rust backtrace when we get the error.

I haven't been able to reproduce this error locally, and looking through the code I don't see anything obvious (at least not obvious to me). The Rust backtrace should give us some more clues.
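For reference, a minimal sketch of the RUST_BACKTRACE suggestion applied from Python rather than from the container or shell environment. The helper name `enable_rust_backtrace` is hypothetical and is not part of indy_vdr or the tails server. It assumes the variable is set before the first indy_vdr call, since the native library reads it from the process environment.

```python
import os


def enable_rust_backtrace(level: str = "1") -> str:
    """Set RUST_BACKTRACE unless the deployment already provides a value.

    Hypothetical helper for illustration; returns the effective value.
    """
    if level not in ("1", "full"):
        raise ValueError("level must be '1' or 'full'")
    # setdefault keeps any value already exported by the container or shell.
    return os.environ.setdefault("RUST_BACKTRACE", level)


# Assumption: this runs at startup, before any indy_vdr call such as
# indy_vdr.open_pool(...), so the Rust side sees the setting.
effective = enable_rust_backtrace()
print(f"RUST_BACKTRACE={effective}")
```

Setting the variable in the deployment environment (for example, the container definition) has the same effect and needs no code change. The backtrace, if one is produced, would be expected on stderr rather than inside the Python traceback.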

esune commented 2 months ago

> The tails server is using vdr version 0.4.2 and the latest is 0.4.3 so possibly a version upgrade will address the issue.
>
> I'm going to try to reproduce the error locally with the tails server with version 0.4.2 and run some load with the Aca-Py integration tests ...

It does not look like 0.4.3 is on PyPI. Might it only have been released for the JS wrapper?

ianco commented 2 months ago

> It does not look like 0.4.3 is on Pypi - might only have been released for the JS wrapper?

There's a previous ticket from when 0.4.2 was released, but I'm not sure who looks after it now. @swcurran?

In any case, we can add the RUST_BACKTRACE setting to the tails server; it should just be an environment update.

swcurran commented 1 month ago

@andrewwhitehead — can you help with this? How do we get a release out?