Closed icc-garciaju closed 8 months ago
Can you provide some additional detail regarding your network setup? You mention the von-network webserver. Is this an instance on von-network, or have you set up your own network? If you've set up your own network, could you provide some information on how you've set it up and how it's laid out? For the queries that are failing, are you using the indy-cli or some other tool?
The network consists of six nodes on AWS using docker images with version 1.12.3. Four nodes were created using a custom script, and the other two using the init_indy_keys script, but the nodes worked fine for several months. They haven't experienced any connectivity issues.
The von-network's webserver was used only to check the recovered ledger.
I've tried indy-cli, an aca-py container and the webserver, but the webserver was unable to recover all transactions. The indy-cli and the aca-py were able to recover the role associated with a nym transaction, but not the schemas or the credential definitions.
But my question is focused more on the recovery than on the error itself.
I've tested with a couple of domain ledger dumps and the behavior is the same.
Steps to reproduce:
read_ledger --type domain --to -1 > domain_transactions_genesis
to dump the ledger.
The read_ledger script was designed for debugging and troubleshooting. It was not designed or tested as a backup and recovery mechanism. The results of using the output from read_ledger as the input for a genesis file are unknown. However, at best, you'll simply get an exact copy of the ledger, so there is little point anyway.
Indy is a distributed ledger, therefore all nodes have a copy of the data. Typically, in the rare cases when there are issues, it's only one or two nodes causing the issues. Deleting the data off those nodes and letting them resync to the ledger is all that is required.
Do you have a backup of the original data directories? If you do, I'd recommend restoring them and troubleshooting from there. That involves identifying the problem nodes and then taking steps to correct the issues with only those nodes. Identifying the problem involves reviewing and interpreting the logs.
Was there an event you can identify around the time the issue started? Was a node added to or removed from the ledger? Were there node connectivity issues? Did you try restarting any of the nodes in the network while initially troubleshooting?
I couldn't identify any issues, and no network nodes were added or removed at that specific point in time. All nodes were responding to read queries, and no "incorrect" messages were displayed in the logs. I only have control of four of the six nodes. I tried rebooting them, but shortly after being rebooted they got a lot of view changes in the queue.
All four of my nodes were able to dump up to the same transaction, and all six nodes showed the same values for the Committed_ledger_root_hashes property, so I don't think it was a data corruption issue.
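The root hash comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name and the input shape (node name mapped to a dictionary of ledger name and root hash) are assumptions, and the hash values are placeholders rather than real output from any node.

```python
from collections import defaultdict


def group_nodes_by_root_hashes(reports):
    """Group node names by the set of ledger root hashes they report.

    `reports` is assumed to look like {node_name: {ledger_name: root_hash}}.
    Returns a list of (hashes, nodes) pairs, largest group first.
    """
    groups = defaultdict(list)
    for node, hashes in reports.items():
        # Sort the items so the key does not depend on dictionary order.
        key = tuple(sorted(hashes.items()))
        groups[key].append(node)
    result = [(dict(key), sorted(nodes)) for key, nodes in groups.items()]
    return sorted(result, key=lambda group: -len(group[1]))


# Placeholder data: three nodes agree, one reports a different domain hash.
reports = {
    "Node1": {"domain": "hashA", "pool": "hashP"},
    "Node2": {"domain": "hashA", "pool": "hashP"},
    "Node3": {"domain": "hashA", "pool": "hashP"},
    "Node4": {"domain": "hashB", "pool": "hashP"},
}

groups = group_nodes_by_root_hashes(reports)
for hashes, nodes in groups:
    print(nodes, hashes)
```

A single group means every node reports the same hashes, which matches the situation described here. More than one group would point at the nodes in the smaller groups as candidates for closer inspection.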
As it was a non-prod blockchain, we disposed of it, but I'm interested in a recovery mechanism for a production blockchain.
The best recommendation is to keep backups of the storage folders. It's recommended in the install docs: https://github.com/hyperledger/indy-node/tree/main/docs/source/install-docs
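As a rough illustration of what a storage folder backup could look like, here is a minimal Python sketch that writes a timestamped tar.gz archive of a directory. The function name and all paths are hypothetical; the demo uses a temporary directory as a stand-in for a node's data directory, and the install docs remain the reference for which folders to back up. It is probably safest to stop the node service before archiving so the files are not changing while they are copied.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def backup_directory(data_dir, backup_dir):
    """Archive `data_dir` into a timestamped .tar.gz file under `backup_dir`."""
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    if not data_dir.is_dir():
        raise NotADirectoryError(f"{data_dir} is not a directory")
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive = backup_dir / f"{data_dir.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the directory name, not the absolute path.
        tar.add(data_dir, arcname=data_dir.name)
    return archive


# Demo on a throwaway directory standing in for a node's storage folder.
workdir = Path(tempfile.mkdtemp())
fake_data = workdir / "node_data"
fake_data.mkdir()
(fake_data / "example.txt").write_text("placeholder ledger data\n")

archive_path = backup_directory(fake_data, workdir / "backups")
with tarfile.open(archive_path, "r:gz") as tar:
    archived_names = tar.getnames()
print(archive_path.name, archived_names)
```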
That said, I personally have yet to encounter a situation where an issue has required restoring an entire network from backups. I've used them to speed up adding nodes to networks, but that's mostly it.
@lynnbendixsen and I have been managing Indy networks for years. I don't think he's encountered an issue requiring the whole network to be restored from backup either. @lynnbendixsen, anything to add?
It's the first time it has happened to me as well.
But being unable to load some transactions but not all of them is what looked suspicious to me.
> But being unable to load some transactions but not all of them is what looked suspicious to me.

I think that is an artifact introduced by using read_ledger to dump the transactions.
Thanks @WadeBarnes for your assistance.
I've been using read_ledger to update the pool_transactions_genesis used by aca-pys and indy clients with 100% success, so I hoped it would also work this time for the domain ledger. As you said, it was designed for debugging purposes and not as a recovery mechanism, so any help it provides is welcome, but nothing more can be expected of it.
Thank you very much.
I've had an issue with a blockchain: we had a lot (>200) of view changes in the queue and, according to https://github.com/hyperledger/indy-node/blob/main/docs/source/troubleshooting.md, having a 43 is very bad. I had one with reason 43, and 28 for the rest.
So I tried to dump the pool and domain ledgers and use the dumps as pool_transactions_genesis and domain_transactions_genesis respectively.
The nodes were able to boot, and the ledger was populated. The webserver from the von-network example displays the correct number of transactions for each ledger, and I'm able to query nym transactions from the webserver, from the indy-cli, and from an aca-py. But if I try to query a schema or a credential definition, they return an empty result.
If I dump the ledger using read_ledger, I get all the transactions.
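One way to double-check that a dump really contains the schema and credential definition transactions is to count transactions by type. The sketch below assumes each line of the dump contains one JSON object with a `txn.type` field (as in Indy genesis files), possibly preceded by a sequence number; the exact read_ledger output format is an assumption here, and the sample lines are made-up placeholders. The type codes used are the standard Indy ones (0 NODE, 1 NYM, 100 ATTRIB, 101 SCHEMA, 102 CLAIM_DEF).

```python
import json
from collections import Counter

# Standard Indy transaction type codes.
TXN_TYPE_NAMES = {
    "0": "NODE",
    "1": "NYM",
    "100": "ATTRIB",
    "101": "SCHEMA",
    "102": "CLAIM_DEF",
}


def count_txn_types(lines):
    """Count transactions per type in an iterable of dump lines."""
    counts = Counter()
    for line in lines:
        start = line.find("{")
        if start == -1:
            continue  # skip blank lines or lines without a JSON object
        txn = json.loads(line[start:])
        txn_type = str(txn.get("txn", {}).get("type", "unknown"))
        counts[TXN_TYPE_NAMES.get(txn_type, txn_type)] += 1
    return counts


# Placeholder lines; a real dump would be read with open(path) instead.
sample_lines = [
    '1 {"txn": {"type": "1", "data": {}}, "txnMetadata": {"seqNo": 1}}',
    '2 {"txn": {"type": "1", "data": {}}, "txnMetadata": {"seqNo": 2}}',
    '3 {"txn": {"type": "101", "data": {}}, "txnMetadata": {"seqNo": 3}}',
    '4 {"txn": {"type": "102", "data": {}}, "txnMetadata": {"seqNo": 4}}',
    "",
]

counts = count_txn_types(sample_lines)
print(dict(counts))
```

If SCHEMA and CLAIM_DEF show up in the counts for the dump but queries against the rebuilt ledger still return nothing, that would suggest the data is present in the file and the problem lies in how it was loaded.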
I used aca-py to publish a new schema, and I can retrieve it, but none of the previously created schemas.
Can anybody help me?
Thanks in advance.