mojaloop / project

Repo to track product development issues for the Mojaloop project.

Built-in Ledger Implementation: Facilitate highly available, strongly consistent, horizontal scaling of the core clearing algorithm. #3773

Open bushjames opened 4 months ago

bushjames commented 4 months ago

Related to issues EAB1, EAB2, ESR1, ESR2, EG3, EG8 in vNext alpha & beta reports by MLF:

The implemented clearing algorithm lacks a distributed mechanism for coordinating read and write access to DFSP liquidity balances. This means only a single instance of the ledger process can be run without corrupting DFSP liquidity balances.

The presented "beta" version of the ledger service holds account balances in memory, meaning they cannot be read and written by other concurrent processes. Although ledger entries are written and balances are updated in the underlying built-in ledger repository (balance updates are missing from the TigerBeetle ledger implementation; the lack of atomicity between writing ledger entries and updating account balances will be raised in a separate ticket), these balances are never read back by the service during operation, leaving the single-instance, in-memory cache as the only operational balance store.

  1. A mutex-type synchronisation mechanism is required to share liquidity balance data between concurrent processes in a thread-safe manner during transfer liquidity checks and funds reservation.
  2. Stateless instances of the ledger service must be able to run concurrently and safely process transfers for any DFSP.
  3. Liquidity balances and ledger entries must be stored in a strongly consistent, scalable, redundant backing store with attributes that satisfy the requirements for national-scale financial infrastructure, i.e. well known, tried and tested, mature, and well supported. For the avoidance of doubt, data stores for Mojaloop financial data must implement strong consistency.
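To illustrate what points 1-3 ask for, here is a minimal sketch (not Mojaloop code) of an optimistic-concurrency pattern: each balance record carries a version, and a reservation only commits if the version is unchanged since it was read, so any number of stateless ledger instances can race safely. The `VersionedStore` below is an in-memory stand-in for a strongly consistent database write that is conditional on the expected version (for example, a MongoDB `findOneAndUpdate` filtered on the version field); all names are hypothetical.

```typescript
interface BalanceRecord {
  balance: number; // minor currency units (a real ledger would use a decimal/bigint type)
  version: number; // incremented on every successful write
}

// In-memory stand-in for a strongly consistent backing store.
class VersionedStore {
  private records = new Map<string, BalanceRecord>();

  seed(accountId: string, balance: number): void {
    this.records.set(accountId, { balance, version: 0 });
  }

  get(accountId: string): BalanceRecord | undefined {
    const rec = this.records.get(accountId);
    return rec ? { ...rec } : undefined;
  }

  // Compare-and-swap: the write succeeds only if the stored version
  // still matches the version the caller read.
  cas(accountId: string, expectedVersion: number, newBalance: number): boolean {
    const rec = this.records.get(accountId);
    if (!rec || rec.version !== expectedVersion) return false;
    this.records.set(accountId, { balance: newBalance, version: expectedVersion + 1 });
    return true;
  }
}

// Liquidity check plus funds reservation that stays correct even when
// several ledger processes operate on the same DFSP account concurrently.
function reserveFunds(store: VersionedStore, accountId: string, amount: number): boolean {
  for (let attempt = 0; attempt < 10; attempt++) {
    const rec = store.get(accountId);
    if (!rec || rec.balance < amount) return false; // liquidity check failed
    if (store.cas(accountId, rec.version, rec.balance - amount)) return true;
    // Another process won the race; re-read the balance and retry.
  }
  return false;
}
```

The key property is that correctness lives in the backing store's conditional write, not in any single process's memory, which is what makes horizontal scaling of the clearing path possible.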
pedrosousabarreto commented 4 months ago

Hi James, the synchronisation mechanism you recommend in point 1 has been there since the beginning, implemented in the gRPC entry point; as a consequence, everything after it is synchronised and delivers the desired characteristics.

Do you have any actual numbers or tests that indicate that it fails to deliver on those characteristics?

bushjames commented 4 months ago

> Hi James, the synchronisation mechanism you recommend in point 1 has been there since the beginning, implemented in the gRPC entry point; as a consequence, everything after it is synchronised and delivers the desired characteristics.
>
> Do you have any actual numbers or tests that indicate that it fails to deliver on those characteristics?

@pedrosousabarreto please could you point me to the code that does this.

bushjames commented 4 months ago

This issue was discussed at design authority on 2024-03-06. @MichaelJBRichards and @PaulMakinMojaloop mentioned that the MongoDB-based "built-in" ledger was never intended for scalable, production-quality deployment. If this is true, corrective work on the built-in ledger implementation may be unnecessary, and the TigerBeetle-based ledger implementation should be prioritised as a fix for these issues. However, discussion is needed on whether to include the built-in ledger in official Mojaloop releases, given that its limitations may not be clearly understood.

pedrosousabarreto commented 4 months ago

Hi @bushjames, it is here, in the CoA gRPC server settings: https://github.com/mojaloop/accounts-and-balances-bc/blob/7c3a7877e21a18c0d21aaeec6a1f4478dc367463/packages/grpc-svc/src/application/grpc_server/grpc_server.ts#L88. Above this, the load-balancer layer also needs to ensure active/passive operation; this is the way it was designed, and it delivers the desired guarantees.
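For readers following along, here is a minimal sketch (not the actual Mojaloop code) of the kind of in-process serialization being described: an async mutex that queues requests behind a promise chain, so all balance updates within a single service instance run one at a time. Note the scope of the guarantee: the lock lives in one process's memory, which is why the design above additionally requires the load balancer to enforce active/passive rather than allowing active/active instances.

```typescript
// Hypothetical in-process async mutex: callbacks passed to runExclusive
// execute strictly one after another, in submission order.
class AsyncMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Extend the chain; swallow errors so one failure does not jam the queue.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// A non-atomic read-modify-write that would race without the mutex.
let balance = 0;
async function addOne(): Promise<void> {
  const b = balance;
  await new Promise((resolve) => setTimeout(resolve, 1)); // simulated I/O
  balance = b + 1;
}
```

With the mutex, two concurrent `runExclusive(addOne)` calls always yield `balance === 2`; without it, both reads could observe the same starting value and one increment would be lost. The limitation under discussion is exactly that a second service instance would have its own `AsyncMutex` and its own memory, so this mechanism alone cannot coordinate two instances.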

Please let me know if you have a test I can use to see it failing and investigate further.

I'm not aware of that mention regarding the "built-in" ledger. The system was designed and built so that both the "built-in" and the TigerBeetle ledgers could be used in production-quality deployments, with the obvious caveat that for larger and more performance-sensitive usages, TigerBeetle should be used.

bushjames commented 4 months ago

Just for the record: I think continued discussion at the design authority is desirable, to form a consensus position on whether the single-instance, MongoDB-backed, built-in ledger implementation is suitable for production deployments in future. I will add it to the agenda for next Wednesday, 2024-03-13, at 10:00 UTC, and will report back here with updates. Please reach out to me with any comments or concerns in the meantime.

karimjindani commented 4 months ago

Hi @bushjames - I read the comment in this thread that the "MongoDB-based 'built-in' ledger was never intended for scalable, production-quality deployment". I have a few questions to understand the issue we are trying to address.

  1. Is there any benchmark for a "scalable, production-quality ledger"? If an objective performance test is carried out and the ledger works well within the parameters that most Mojaloop adopters would require for the foreseeable future, do we really need to change the database technology at this stage?

  2. In Mojaloop's current version, what type of ledger is used? From what I have read so far, it is also a set of "built-in" services. How is it different from vNext?

Regards Karim

bushjames commented 4 months ago

hi @karimjindani , I do not believe the "built-in" ledger has the parameters that most Mojaloop adopters will require in the foreseeable future. It has a fundamental limitation: only one process may service clearing requests for any given DFSP. This is an anti-pattern for high availability, and the proposed load-balancing solution does not address this limitation.

A hot-swap pattern would be necessary, whereby a redundant process is kept idle in case the first process dies unexpectedly. This introduces the unnecessary risk that the standby may not be able to handle requests; it may have lost connections to backing services, etc., especially given that health checks (e.g. from the Kubernetes pod scheduler) may not be implemented sufficiently well to mitigate this risk (they currently are not). Regardless, the only way to truly have confidence that a pod can handle requests is if it is actively doing so; this is an important risk-mitigation strategy for high availability. In my opinion, this is a risky single point of failure that can be, and has already been, eliminated with the existing active/active pattern.

There is also the question of scalability: there is a limit to how much throughput any single hardware node can deliver. Mojaloop made design decisions at the start of the project to favour scaling via the addition of low-cost commodity hardware nodes. I see no reason to abandon those design principles now in favour of a pattern that introduces a single-point-of-failure risk into the architecture and also has the other aforementioned disadvantages over the currently implemented pattern. I am absolutely firm that these characteristics are contrary to high-availability best-practice design as required for nation-scale critical financial infrastructure, which is what our adopters want and expect from Mojaloop.
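On the health-check point above: even well-built probes only reduce, not eliminate, the risk that an idle standby cannot serve. A minimal sketch (all names hypothetical, not Mojaloop code) of the distinction being drawn is a liveness endpoint that merely reports "process is up" versus a readiness endpoint that actually verifies backing-service connections before declaring the pod fit for traffic:

```typescript
import * as http from "http";

type ReadinessCheck = () => Promise<boolean>;

// checkBackingServices is an assumed placeholder for real checks, e.g.
// pinging the database and the message bus.
function createHealthServer(checkBackingServices: ReadinessCheck): http.Server {
  return http.createServer(async (req, res) => {
    if (req.url === "/live") {
      // Liveness: the process is running; says nothing about dependencies.
      res.writeHead(200).end("OK");
    } else if (req.url === "/ready") {
      // Readiness: fit for traffic only if backing connections are verified.
      const ok = await checkBackingServices();
      res.writeHead(ok ? 200 : 503).end(ok ? "READY" : "NOT READY");
    } else {
      res.writeHead(404).end();
    }
  });
}
```

Even so, a passing readiness probe is a point-in-time sample; a pod that is actively and successfully processing transfers (active/active) gives continuous, direct evidence of fitness, which is the argument made above.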