orcfax / Incidents

A repository to triage and report issues in Orcfax network operations

INCIDENT 017 | Duplicate tx-builder-grpc processes interfered with processing #20

Open Christian-MK opened 8 months ago

Trigger

Date

2024-02-15

Summary

A duplicate instance of the background transaction building gRPC service in the Cardano Open Oracle Protocol (COOP) environment prevented requests from making it on-chain.

Status

Resolved

Assessment

A secondary transaction builder process spawned in the COOP environment, preventing transactions from making it on-chain. The cause of the secondary process is unclear; however, maintenance work that day might have had an impact. The incident prevented communication with the transaction builder, the component that results in datum being published on-chain.

The last transaction on-chain before the incident was at 15:44 UTC and the resolution was at 18:14 UTC (2.5 hours total).
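The duration above can be checked with a short calculation. This is a minimal sketch that assumes both timestamps fall on the incident date given in this report (2024-02-15); the variable names are illustrative only.

```python
from datetime import datetime, timezone

# Timestamps from the report, both UTC. The date for each is assumed
# to be the incident date, 2024-02-15.
last_tx = datetime(2024, 2, 15, 15, 44, tzinfo=timezone.utc)
resolved = datetime(2024, 2, 15, 18, 14, tzinfo=timezone.utc)

# Subtracting two datetimes gives a timedelta; convert it to hours.
outage = resolved - last_tx
outage_hours = outage.total_seconds() / 3600

print(f"Outage duration: {outage_hours} hours")  # Outage duration: 2.5 hours
```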

Additional Notes

While we were able to observe a publication event being requested and two further gRPC processes interacting with one another, we were unable to see any output in the transaction builder logs. This lack of information made it difficult to identify the issue and the appropriate actions to take.
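Because the logs gave no indication of the problem, a check on the process table is one way a duplicate like this could be spotted. The following is a minimal sketch and not the procedure Orcfax uses: it assumes the service appears in the process list under the name `tx-builder-grpc` (taken from the incident title), and the helper names `find_duplicates` and `running_processes` and the sample lines are hypothetical.

```python
import subprocess
from collections import defaultdict


def find_duplicates(ps_lines, names):
    """Return {name: [pids]} for each watched name found in more than one process.

    Each line is expected to look like "<pid> <command line>".
    Matching is a plain substring test, so it can over-match.
    """
    seen = defaultdict(list)
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        pid, command = int(parts[0]), parts[1]
        for name in names:
            if name in command:
                seen[name].append(pid)
    return {name: pids for name, pids in seen.items() if len(pids) > 1}


def running_processes():
    """List "<pid> <command line>" for every process, using POSIX ps."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


# Hypothetical sample input, for illustration only (not real log output).
sample = [
    "  101 tx-builder-grpc",
    "  202 tx-builder-grpc",
    "  303 some-other-service",
]
print(find_duplicates(sample, ["tx-builder-grpc"]))
# {'tx-builder-grpc': [101, 202]}
```

On a live host, `find_duplicates(running_processes(), ["tx-builder-grpc"])` would return an empty dictionary when only one instance is running. The substring match is deliberately simple, so a shell or editor whose command line mentions the name would also be counted.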

The cause of this issue was small but the impact was significant. Debugging procedures will be updated to reflect new information regarding what happened and steps taken to resolve the issue.

Transaction building processes are currently being replaced as part of ongoing improvements to the Orcfax network.

Technical improvements

We are investigating:

  1. Replacing transaction building components with ones that provide more advanced logging.

Documentation improvements

  1. Debug procedures will be updated for DevOps and other network maintainers.