quantadex / distributed_quanta_bridge

The distributed version of the quanta bridge

Monitoring #47

Open quocble opened 5 years ago

quocble commented 5 years ago

Motivation

There have been a number of incidents where deposits and withdrawals were delayed, often due to configuration errors, versioning issues, running out of disk space, code errors (such as a failed SQL insert), and so forth.

Design

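How can we build health reporting into /api/status, so that it returns one of the following?

  200 - OK
  503 - Service Unavailable
  504 - Service Degraded (HTTP defines 504 as Gateway Timeout; it is repurposed here to signal a degraded service)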

What should we track?

What does it mean to be degraded?

What does it mean to be unavailable?

type BlockchainStatus struct {
  CurrentBlock        int
  TimeSinceLastBlock  int    // sec
  DegradedThreshold   int
  FailureThreshold    int
  TotalAddresses      int
  AddressesCreated24H int
  State               string // Normal | Degraded | Failure
}

type DepositStatus struct {
  ConsensusRetries  int
  DegradedThreshold int
  FailureThreshold  int
}

type WithdrawalStatus struct {
  ConsensusRetries  int
  DegradedThreshold int
  FailureThreshold  int
}
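
The questions about what "degraded" and "unavailable" mean are still open, but one possible reading of the threshold fields is sketched below in Go. The `State` type and the `classify` helper are hypothetical names, not part of the bridge code. The sketch assumes a measured value (seconds since the last block, or consensus retries) is compared against each threshold with `>=`, and that Failure takes precedence over Degraded.

```go
package main

import "fmt"

// State mirrors the Normal | Degraded | Failure values listed in the issue.
type State string

const (
	Normal   State = "Normal"
	Degraded State = "Degraded"
	Failure  State = "Failure"
)

// classify is a hypothetical helper. It compares a measured value against
// the two thresholds. Using >= and checking the failure threshold first
// are assumptions, not decisions taken in the issue.
func classify(value, degradedThreshold, failureThreshold int) State {
	switch {
	case value >= failureThreshold:
		return Failure
	case value >= degradedThreshold:
		return Degraded
	default:
		return Normal
	}
}

func main() {
	// Illustrative thresholds only: degraded after 600 s, failed after 3600 s.
	fmt.Println(classify(30, 600, 3600))   // Normal
	fmt.Println(classify(900, 600, 3600))  // Degraded
	fmt.Println(classify(7200, 600, 3600)) // Failure
}
```

The same helper could serve BlockchainStatus (with TimeSinceLastBlock as the value) and DepositStatus or WithdrawalStatus (with ConsensusRetries as the value), since all three carry the same pair of thresholds.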

/api/status
{
  "BuildTime": "",
  "GitHash": "",
  "ListenIP": "192.168.137.186",
  "PublicKey": "QA6nkaBAz1vV6cb25vSHXJqHos1AeADzqRAXPvtASXMAhM3SbRFA",
  "Version": "1.0",
  "BTC": {
  },
  "BCH": {
  },
  "Deposit": {
  },
  "Withdrawal": {
  },
  "TotalDegraded": N,
  "TotalFailures": N
}

Returns 504 when in the Degraded state (TotalDegraded > 0).
Returns 503 when in the Failure state (TotalFailures > 0).
Returns 200 when both are 0.
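
The status code rule above could be sketched in Go roughly as follows. `StatusSummary`, `statusCode`, and `statusHandler` are hypothetical names, and the struct carries only the two counters from the example response. The issue does not say which code wins when both counters are non-zero; this sketch assumes failures take precedence.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// StatusSummary holds only the two counters from the example response.
type StatusSummary struct {
	TotalDegraded int
	TotalFailures int
}

// statusCode maps the counters to the codes proposed in the issue.
// Checking failures before degraded is an assumption.
func statusCode(s StatusSummary) int {
	switch {
	case s.TotalFailures > 0:
		return http.StatusServiceUnavailable // 503
	case s.TotalDegraded > 0:
		return http.StatusGatewayTimeout // 504, used here to mean "degraded"
	default:
		return http.StatusOK // 200
	}
}

// statusHandler is a hypothetical handler. The current function stands in
// for however the bridge would collect its counters.
func statusHandler(current func() StatusSummary) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		s := current()
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(statusCode(s))
		_ = json.NewEncoder(w).Encode(s)
	}
}

func main() {
	// Exercise the handler in-process instead of opening a socket.
	h := statusHandler(func() StatusSummary {
		return StatusSummary{TotalDegraded: 1}
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/status", nil)
	h.ServeHTTP(rec, req)
	fmt.Println(rec.Code)
}
```

Returning the full JSON body alongside a non-200 code lets a monitoring probe alert on the status code alone, while an operator can still read the body to see which component is degraded.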

Links

  1. https://stackoverflow.com/questions/53985294/what-should-the-http-status-code-of-a-degraded-health-check-be