Open MichaelJBRichards opened 1 year ago
Initial thoughts. CUID2 is new, meets all our requirements, and its adoption since January 2023 is growing quickly. A contender that I think we should consider is NanoID.
Because of the difference in character length, it would be possible to support both the new chosen CorrelationID and the existing one. Should FSPIOP support both CorrelationIDs?
FSPIOP could support both formats, but ISO 20022, as remarked, can't, due to the field length of ISO identifiers. If we want to retain compatibility between the two standards, we would need to drop UUIDV4.
As to CUID vs. NanoID, I am happy to leave that discussion to technical experts.
Thanks @MichaelJBRichards. At a high level, I'm fine with moving to CUID2 instead of UUID in version 2.0 of the FSPIOP API. It seems like you can customize the length of your CUID2, so we can choose a length that suits others such as ISO as well.
As discussed in today's SIG meeting, the proposal is to move to CUID2 in version 2.0 of the API, to have a more modern industry standard for unique IDs that can fit in the ISO 20022 data model.
The background of this issue says: "The current implementation of the ISO 20022 data model limits the length of a unique identifier to 35 characters." UETR (Unique End to End Transaction Reference) is 36 characters and supports UUID v4.
That is true of the UETR, but not of any of the other identifiers used in the ISO 20022 messages. However, we have an agreed alternative for this.
In the last DA session I was given a reference to this ticket, and I saw that its first sentence states that the ISO 20022 data model has this limitation, which isn't true. We can and should use the UETR as the end-to-end unique tracking reference, as this is suggested in the latest (2023) paper by BIS on harmonised requirements for cross-border payments. It supports the UUID v4 format, and as such we don't need to shift to any other format.
All of this is, of course, completely true as regards the UETR, Karim. The problem is that the UETR is the only identifier which is defined in this way, and there are several API endpoints where we need more than one unique identifier. For instance: POST /quotes (quoteId, transactionId, transactionRequestId). So there is no problem with the UETR: the problem we need to solve is with the other identifiers which we need to include.
https://github.com/ulid/spec is an alternative suggestion for performance reasons while still also keeping the character limit for ISO 20022.
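To make the ULID suggestion concrete, here is a minimal sketch of the layout described in the linked spec: a 48-bit millisecond timestamp followed by 80 bits of randomness, encoded as 26 Crockford base32 characters. The function name `make_ulid` is hypothetical, and the sketch leaves out the spec's optional monotonicity rule for ids generated within the same millisecond, so it should not be read as a reference implementation.

```python
import os
import time

# Crockford base32 alphabet used by the ULID spec (no I, L, O or U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def make_ulid(timestamp_ms=None):
    """Return a 26-character ULID-style id (hypothetical helper)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if not 0 <= timestamp_ms < 2 ** 48:
        raise ValueError("timestamp must fit in 48 bits")

    # 48-bit timestamp in the high bits, 80 random bits in the low bits.
    randomness = int.from_bytes(os.urandom(10), "big")
    value = (timestamp_ms << 80) | randomness

    # Emit 26 base32 digits, least significant first, then reverse.
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(make_ulid())
```

Because the timestamp occupies the leading characters and the alphabet is in ascending ASCII order, ids created in later milliseconds sort after earlier ones as plain strings, which is the property that matters for indexed primary keys.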
UUIDv7 has the drawback that it is still 36 characters, i.e. not possible to use in ISO 20022.
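The length constraint is easy to check. The sketch below uses Python's standard `uuid` module; the constant and function names are hypothetical. The canonical text form of any UUID version is 32 hexadecimal digits plus 4 hyphens, so a UUIDv4 is used here to stand in for the length of a UUIDv7 as well.

```python
import uuid

# Limit quoted in this issue for ISO 20022 identifiers (other than the UETR).
ISO_20022_MAX_ID_LENGTH = 35


def fits_iso_20022(identifier: str) -> bool:
    """Return True if the identifier fits the 35-character limit."""
    return len(identifier) <= ISO_20022_MAX_ID_LENGTH


if __name__ == "__main__":
    canonical = str(uuid.uuid4())  # 8-4-4-4-12 hex digits with hyphens
    print(len(canonical), fits_iso_20022(canonical))
```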
Note that there have been related discussions in #131, including additional information on performance in https://github.com/mojaloop/mojaloop-specification/pull/131#issuecomment-2147059300.
Here is a summary of the discussion:
A disadvantage of CUID2 is its performance impact on databases and other storage that need to index the id:
The readme for CUID2 contains a Note on K-Sortable/Sequential/Monotonically Increasing Ids, which recommends the use of createdAt fields. The problem is that this is not possible to do for many of the ids in Mojaloop without quite substantial effort, as in many cases the generated ids are primary keys and any non-sequential id generators lead to quite bad performance when data is accumulated. Avoiding this issue when non-sequential ids are generated will require a lot of effort in restructuring the database and probably reworking the table lookups to use a time range, as the primary key must be changed. The changes might even affect some of the logic of the flows. The issue is probably related to not just the SQL database, but also other places where we are likely to store and index the data, like log aggregators, etc.
The main argument CUID2 makes against sequential ids is that they leak timestamps, but this is just a generic claim and we should consider that many of the ids we generate are only significant for a short period of time, given the real-time nature of the functionality they are associated with. So this "leak" is not really an issue in our case. It is only significant when associated with entities that are not so closely tied to real time, like account creation, customer creation, etc. Instead of worrying about a leak, it may be better to improve the logic and restrict any operations that relate to IDs outside of a certain timeframe.
The section also recommends the use of cloud solutions and in-memory databases, which is not the inclusivity we are working for, and their use is often restricted by regulation. I think the allowed id generators should not be so restrictive, as for example the most important thing we want to restrict is the length, and even this can probably be parametrized during DB creation. Accepting only CUID2 will feel like a win for the cloud providers, not for inclusivity.
Some possible alternatives for monotonically increasing ids are:
I think multiple approaches should be considered depending on the particular use cases:
Finally, it is best to write the software in a way that allows the IDs used to be configurable and agreed within the implementation. Implementations dealing with ISO 20022 should not impose complex and expensive ID requirements on other implementations.
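One way to picture a configurable id scheme is a small registry of generator functions chosen by name, with an optional length limit applied by the caller. This is only a sketch of the idea, not how Mojaloop is implemented: `GENERATORS`, `register_generator` and `new_id` are hypothetical names, and only the standard-library UUIDv4 generator is registered here.

```python
import uuid
from typing import Callable, Dict, Optional

# A generator is any zero-argument function returning a string id.
IdGenerator = Callable[[], str]

# Hypothetical registry: deployments add whichever schemes they agree on.
GENERATORS: Dict[str, IdGenerator] = {
    "uuid4": lambda: str(uuid.uuid4()),
}


def register_generator(name: str, generator: IdGenerator) -> None:
    """Make another id scheme selectable by name."""
    GENERATORS[name] = generator


def new_id(scheme: str, max_length: Optional[int] = None) -> str:
    """Generate an id with the configured scheme, enforcing an optional limit."""
    value = GENERATORS[scheme]()
    if max_length is not None and len(value) > max_length:
        raise ValueError(
            f"{scheme} id is {len(value)} characters, limit is {max_length}"
        )
    return value


if __name__ == "__main__":
    print(new_id("uuid4"))
```

With this shape, a deployment that exchanges ISO 20022 messages could pass `max_length=35` and register a shorter scheme, while other deployments keep their existing generator.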
Hi @henrka, when you get a chance, maybe in one of the next meetings, let's capture the FSPIOP SIG decision on this here so that the DA can reference it. Thank you!
Let's make a formal decision in the meeting on Thursday.
FSPIOP API SIG has decided to change the correlation ID to ULID, starting from version 2.0 of the API.
Open API for FSP Interoperability - Change Request
Table of Contents
1. Preface
This section contains basic information regarding the change request.
1.1 Change Request Information
| Requested By | Michael Richards, Infitx |
| Change Request Status | In review ☒ / Approved ☐ / Rejected ☐ |
| Approved/Rejected Date | |
1.2 Document Version Information
2. Problem Description
2.1 Background
The current implementation of the ISO 20022 data model limits the length of a unique identifier to 35 characters. This is (just) not sufficient to hold a UUIDV4, which is the current Mojaloop standard for a Correlation ID. We therefore need to change something. The following alternative proposals have been discussed in the ISO 20022 SIG:
A summary of discussion on these points:
2.2 Current Behaviour
Explain how the API currently behaves.
All correlation IDs are specified as UUID identifiers, and are defined as instances of the BinaryString32 data element. This is a fixed-length string which can contain alphanumeric characters and hyphens.
2.3 Requested Behaviour
Explain how you would like the API to behave.
The ISO 20022 SIG has proposed moving to the CUID2 standard for the generation of unique identifiers. This standard appears to offer:
Most of the work required to implement this change will be on the APIs, and I shall be raising an issue on the FSPIOP API to consider this; but I wanted the DA to consider it from a technical perspective first.
3. Proposed Solution Options
Change the data type of the CorrelationID element (Section 7.3.8 of the FSPIOP specification) to be a new data type. This should be a restriction of the existing BinaryString data type (Section 7.2.17 of the FSPIOP specification) which has a fixed length of 32 characters and permits only lower-case alphanumeric characters.
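As an illustration, the restriction described above (exactly 32 characters, lower-case alphanumerics only) can be expressed as a simple pattern check. The pattern and the function name `is_valid_correlation_id` are assumptions derived from that sentence, not text taken from the FSPIOP specification.

```python
import re

# Assumed pattern: exactly 32 characters, each a digit or a lower-case letter.
CORRELATION_ID_PATTERN = re.compile(r"[0-9a-z]{32}")


def is_valid_correlation_id(value: str) -> bool:
    """Check a candidate id against the proposed restricted data type."""
    return CORRELATION_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_valid_correlation_id("a" * 32))
    print(is_valid_correlation_id("A" * 32))
```

Under this pattern a hyphenated 36-character UUID is rejected, as is any value containing upper-case letters.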