mojaloop / mojaloop-specification

This repo contains the specification document set of the Open API for FSP Interoperability
https://docs.mojaloop.io/api

Settlements CGS Handler v15.0.3 failing on GP CGS tests #125

Closed mdebarros closed 1 year ago

mdebarros commented 1 year ago

Describe the bug

CGS test scenarios are failing in GP tests on Helm v15.2.0-rc release.

Tracing a POST /scenario from the CGS OTS test cases through the Position, Notification and CGS handlers gave the following timestamps:

  1. Position Handler:
       2023-10-18T15:46:00.168Z - Prepare Ingress
       2023-10-18T15:46:00.232Z - Fulfil Ingress
       2023-10-18T15:46:00.241Z - Fulfil Egress

  2. Notify Handler:
       2023-10-18T15:46:00.187Z - Prepare Ingress
       2023-10-18T15:46:00.247Z - Fulfil Ingress
       2023-10-18T15:46:00.251Z - Fulfil Egress

  3. CGS handler:
       2023-10-18T15:48:16.222Z - Prepare Ingress
       2023-10-18T15:48:16.223Z - Fulfil Ingress
       2023-10-18T15:48:16.225Z - updateTransferSettlement log entry

Fortunately, all three handlers run on the same Kubernetes node, so their timestamps come from the same system clock and can be compared directly, as the following shows:

❯ kubectl get po -o wide | grep position
moja2-centralledger-handler-transfer-position-5847cdd497-487zd    2/2     Running   0             22h   10.1.6.11    ip-10-1-6-153.eu-west-2.compute.internal   <none>           <none>
❯ kubectl get po -o wide | grep sett
moja2-centralsettlement-handler-grosssettlement-6fc5f6b8b78dcwz   2/2     Running   0             22h   10.1.6.232   ip-10-1-6-153.eu-west-2.compute.internal   <none>           <none>
❯ kubectl get po -o wide | grep noti
moja2-ml-api-adapter-handler-notification-bd7589bd7-87lmm         2/2     Running   0             22h   10.1.6.56    ip-10-1-6-153.eu-west-2.compute.internal   <none>           <none>
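
The same-node check above can also be done programmatically. The following is a minimal sketch, assuming the default column order of `kubectl get po -o wide` (NAME, READY, STATUS, RESTARTS, AGE, IP, NODE, ...) and a plain integer in the RESTARTS column; the helper name `node_of` is hypothetical.

```python
# Rows copied from the kubectl output in this issue (whitespace collapsed).
rows = [
    "moja2-centralledger-handler-transfer-position-5847cdd497-487zd 2/2 Running 0 22h 10.1.6.11 ip-10-1-6-153.eu-west-2.compute.internal <none> <none>",
    "moja2-centralsettlement-handler-grosssettlement-6fc5f6b8b78dcwz 2/2 Running 0 22h 10.1.6.232 ip-10-1-6-153.eu-west-2.compute.internal <none> <none>",
    "moja2-ml-api-adapter-handler-notification-bd7589bd7-87lmm 2/2 Running 0 22h 10.1.6.56 ip-10-1-6-153.eu-west-2.compute.internal <none> <none>",
]


def node_of(row: str) -> str:
    """Return the NODE column (7th field) of one `kubectl get po -o wide` row.

    Assumption: RESTARTS is a bare integer. A value such as "1 (5m ago)"
    would shift the columns and break this simple split.
    """
    return row.split()[6]


nodes = {node_of(row) for row in rows}
same_node = len(nodes) == 1
print(nodes, same_node)
```

A single entry in `nodes` means all three pods are scheduled on one machine.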

Observation

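The CGS handler only begins processing the transfer more than 2m after the Position and Notification handlers have finished with it (about 2m16s between the Position Handler's Fulfil Egress and the CGS handler's Fulfil Ingress).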
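
The size of the delay follows directly from the timestamps listed in the bug description. A minimal sketch of the arithmetic, using only those timestamps (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"  # ISO-8601 with milliseconds and a literal Z


def parse(ts: str) -> datetime:
    """Parse one of the handler timestamps quoted in this issue."""
    return datetime.strptime(ts, FMT)


position_prepare_ingress = parse("2023-10-18T15:46:00.168Z")
position_fulfil_egress = parse("2023-10-18T15:46:00.241Z")
cgs_prepare_ingress = parse("2023-10-18T15:48:16.222Z")
cgs_fulfil_ingress = parse("2023-10-18T15:48:16.223Z")

# Lag between the Position Handler and the CGS handler for each message type.
prepare_lag = (cgs_prepare_ingress - position_prepare_ingress).total_seconds()
fulfil_lag = (cgs_fulfil_ingress - position_fulfil_egress).total_seconds()

print(f"prepare lag: {prepare_lag:.3f}s")  # 136.054s
print(f"fulfil lag:  {fulfil_lag:.3f}s")   # 135.982s
```

Both lags are roughly 136 seconds, well above the default 5s artificial delay used by the tests.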

Actions taken

  1. We increased the artificial delay to 3m (default is 5s), which finally allowed the full GP test suite to pass, as this is longer than the 2m delay observed when tracing the transfer Ingress/Egress timestamps.

  2. The CGS Handler logs also had several errors occurring intermittently as follows:

2023-10-18T15:47:34.148Z - info: "TransferFulfilHandler::getSettlementModelByTransferId"
2023-10-18T15:47:34.150Z - error: "transferSettlement::processMsgFulfil - error! The database must be connected to get a table object"
2023-10-18T15:47:37.151Z - info: "TransferFulfilHandler::getSettlementModelByTransferId"
2023-10-18T15:47:37.153Z - error: "transferSettlement::processMsgFulfil - error! The database must be connected to get a table object"
2023-10-18T15:47:40.155Z - info: "TransferFulfilHandler::getSettlementModelByTransferId"
2023-10-18T15:47:40.157Z - error: "transferSettlement::processMsgFulfil - error! The database must be connected to get a table object"
2023-10-18T15:47:40.157Z - error: "grossSettlementHandler::processTransferSettlement::validationPassed::The database must be connected to get a table object--0 The database must be connected to get a table object"
2023-10-18T15:47:40.158Z - error: {
  "type": "application/json",
  "content": {
    "error": {
      "name": "FSPIOPError",
      "cause": "Error: The database must be connected to get a table object\n    at Database.from (/opt/app/node_modules/@mojaloop/database-lib/src/database.js:92:13)\n    at Object.getByIdLight (/opt/app/node_modules/@mojaloop/central-ledger/src/models/transfer/facade.js:126:21)\n    at Object.getSettlementModelByTransferId (/opt/app/src/models/transferSettlement/facade.js:341:52)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async Object.processMsgFulfil (/opt/app/src/domain/transferSettlement/index.js:38:36)\n    at async /opt/app/src/handlers/grossSettlement/handler.js:107:13",
      "apiErrorCode": {
        "code": "2001",
        "message": "Internal server error",
        "name": "INTERNAL_SERVER_ERROR",
        "type": {
          "regex": "^20[0-9]{2}$",
          "description": "Generic Server Error",
          "httpStatusCode": 500,
          "name": "GENERIC_SERVER_ERROR"
        },
        "httpStatusCode": 500
      },
      "httpStatusCode": 500,
      "useMessageAsDescription": false,
      "message": "The database must be connected to get a table object",
      "stack": "FSPIOPError: The database must be connected to get a table object\n    at createFSPIOPError (/opt/app/node_modules/@mojaloop/central-services-error-handling/src/factory.js:198:12)\n    at Object.reformatFSPIOPError (/opt/app/node_modules/@mojaloop/central-services-error-handling/src/factory.js:333:12)\n    at Object.getByIdLight (/opt/app/node_modules/@mojaloop/central-ledger/src/models/transfer/facade.js:171:32)\n    at Object.getSettlementModelByTransferId (/opt/app/src/models/transferSettlement/facade.js:341:52)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async Object.processMsgFulfil (/opt/app/src/domain/transferSettlement/index.js:38:36)\n    at async /opt/app/src/handlers/grossSettlement/handler.js:107:13\nError: The database must be connected to get a table object\n    at Database.from (/opt/app/node_modules/@mojaloop/database-lib/src/database.js:92:13)\n    at Object.getByIdLight (/opt/app/node_modules/@mojaloop/central-ledger/src/models/transfer/facade.js:126:21)\n    at Object.getSettlementModelByTransferId (/opt/app/src/models/transferSettlement/facade.js:341:52)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async Object.processMsgFulfil (/opt/app/src/domain/transferSettlement/index.js:38:36)\n    at async /opt/app/src/handlers/grossSettlement/handler.js:107:13"
    }
  },
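
The processMsgFulfil errors in the log excerpt recur at a regular spacing. A minimal sketch that strips the colour-code residue from the raw lines and measures the gap between consecutive errors; the regular expressions and names are illustrative, and the issue does not say what drives the spacing (a retry or poll interval is one possibility):

```python
import re
from datetime import datetime

# Colour codes appear in the pasted log as "[31m" / "[39m" (escape byte lost).
ANSI = re.compile(r"\x1b?\[\d+m")
LINE = re.compile(r"^(\S+) - (\w+): (.*)$")

raw = [
    '2023-10-18T15:47:34.148Z - [32minfo[39m: "TransferFulfilHandler::getSettlementModelByTransferId"',
    '2023-10-18T15:47:34.150Z - [31merror[39m: "transferSettlement::processMsgFulfil - error! The database must be connected to get a table object"',
    '2023-10-18T15:47:37.151Z - [32minfo[39m: "TransferFulfilHandler::getSettlementModelByTransferId"',
    '2023-10-18T15:47:37.153Z - [31merror[39m: "transferSettlement::processMsgFulfil - error! The database must be connected to get a table object"',
    '2023-10-18T15:47:40.155Z - [32minfo[39m: "TransferFulfilHandler::getSettlementModelByTransferId"',
    '2023-10-18T15:47:40.157Z - [31merror[39m: "transferSettlement::processMsgFulfil - error! The database must be connected to get a table object"',
]


def parse(line: str):
    """Split a log line into (timestamp, level, message) after removing colour codes."""
    ts, level, msg = LINE.match(ANSI.sub("", line)).groups()
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ"), level, msg


# Keep only the processMsgFulfil errors and compute the spacing between them.
errors = [ts for ts, level, msg in map(parse, raw)
          if level == "error" and "processMsgFulfil" in msg]
gaps = [round((b - a).total_seconds(), 3) for a, b in zip(errors, errors[1:])]
print(gaps)  # [3.003, 3.004]
```

In this excerpt the failed attempts are about 3 seconds apart.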

Downgrading the CGS handler from v15.0.3 (which includes Node.js upgrades and fixes) to v15.0.0 (the version used in the Mojaloop v15.1.0 release) made the processing delay disappear.


mdebarros commented 1 year ago

Closing.

Opened in wrong repo