Closed by JamesMBligh 5 years ago
transactionId
Making the transactionId mandatory requires the data provider to maintain a permanent ID mapping per transaction, per consumer, per data recipient. This is a considerable overhead, especially when the 7-year retention requirement is considered.
It is also impossible for the data recipient to determine whether a transaction has additional detail available, which would trigger a large number of needless calls to GET /banking/accounts/{accountId}/transactions/{transactionId} just to determine if there is additional detail.
NPP adoption is not mandatory and seems to have been enabled by about half of ADIs. Among organisations that have adopted NPP, the substantial majority of transactions today do not occur through NPP, but through the card switch networks (ATMs and EFTPOS) and batch systems.
Therefore the substantial majority of transactions will not have any additional data available via the GET /banking/accounts/{accountId}/transactions/{transactionId} path.
It would be a mistake to add an overhead of this size to something that is relevant to only a small fraction of transactions.
We recommend that the transactionId be made conditional, populated only for transactions that have additional detail available.
This avoids needlessly storing the mapping of internal transaction identifiers, and it avoids having a data recipient call GET /banking/accounts/{accountId}/transactions/{transactionId} for every transaction just in case there are additional details.
Filtering
The proposal seems to leave it up to the data provider to determine which transactions fall within the start-date and end-date range specified by the data recipient. Data providers may be in different time zones from each other, the data consumers, and the end users themselves.
We recommend that date filters like start-date and end-date be specified as UTC datetimes, to make it clear to the data provider which point in time the data consumer would like to see data for.
The inclusion of a text filter defined as "where this string value is found as a substring of either the reference or description fields" is an implementation concern. We could efficiently find transactions where the reference or description starts with the text value; however, a contains query cannot use a b-tree index. If retained in its current form, this requirement will be costly to implement. It should also be made explicit that the consumer would probably expect this to be a case-insensitive query.
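The index point can be illustrated concretely. The sketch below uses SQLite purely as a stand-in for a provider's transaction store (the table and column names are invented): a "starts with" match can be expressed as a range predicate that a b-tree index satisfies, while a "contains" match forces a full scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, description TEXT)")
conn.execute("CREATE INDEX idx_desc ON txn (description)")

def plan(sql):
    """Return SQLite's query plan for a statement as a single string."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# "Starts with" can be rewritten as a range predicate, so the b-tree index applies.
prefix = plan(
    "SELECT id FROM txn WHERE description >= 'rent' AND description < 'renu'")

# "Contains" cannot be turned into a range, so every row must be examined.
contains = plan("SELECT id FROM txn WHERE instr(description, 'rent') > 0")

print(prefix)    # plan reports a SEARCH using the index
print(contains)  # plan reports a full table SCAN
```

Note that case-insensitive matching, which consumers would likely expect, would in practice additionally require an index built with a case-insensitive collation.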
Ordering
The standards should specify the order in which the transaction data must be returned. These are large data sets, and sorting is expensive, so we do not think that sorting should be a query parameter with multiple options that data providers must support.
Since the transaction data is created in order of postDateTime, the start-date and end-date parameters are for posting dates, and many data consumers will be using the APIs to retrieve a delta of transactions that have occurred since the last time they checked, we believe that the order should be postDateTime DESC.
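As a hypothetical illustration of why newest-first ordering suits delta retrieval (the field names and page shapes below are invented, not from the proposal): a consumer paging through postDateTime DESC results can stop as soon as it reaches a transaction it has already stored, instead of paging through the full history.

```python
def new_since(pages, last_seen_id):
    """Collect transactions newer than last_seen_id.

    pages: iterable of transaction pages, newest first (hypothetical shape).
    Stops early at the first already-seen transaction, since everything
    after it in a DESC-ordered feed must already be stored.
    """
    fresh = []
    for page in pages:
        for txn in page:
            if txn["transactionId"] == last_seen_id:
                return fresh  # everything older is already stored locally
            fresh.append(txn)
    return fresh  # last_seen_id not found: the whole feed is new

# Two pages, newest first.
pages = [[{"transactionId": "t5"}, {"transactionId": "t4"}],
         [{"transactionId": "t3"}, {"transactionId": "t2"}]]
print([t["transactionId"] for t in new_since(pages, "t4")])
```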
Specific Areas Of Concern
Has the NPP extended information been modelled correctly?
The use of the term extendedData when it applies only to NPP messages seems to leave the way open for this object to be used for other purposes as yet undefined. Additionally, NPP only supports the x2p1 overlay at present, with other overlays in the works. It is unclear what interesting data might be made available from other overlays.
It might be better to introduce a transaction$type which can currently have the enum value npp-x2p1, and then have an npp object that has the from, to and extendedDescription properties.
This could then be extended to include npp-x2p2, npp-x2p3, eftpos, bpay, etc.
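A hypothetical payload following this shape might look as follows; all field values are invented for illustration, and transaction$type and npp are the names proposed above, not part of the current standard.

```python
# Sketch of a transaction carrying scheme-specific detail behind a discriminator.
npp_x2p1_txn = {
    "transactionId": "txn-001",          # conditional, per the compromise above
    "transaction$type": "npp-x2p1",      # discriminator; later values might be
                                         # "npp-x2p2", "eftpos", "bpay", ...
    "npp": {
        "from": "acc-123",                        # accountId for outgoing payments
        "to": "payid:jane@example.com",           # PayID, BSB/account number, or PAN
        "extendedDescription": "Rent for March",  # up to 280 Unicode characters
        "service": "x2p1",                        # "x2p1" indicates an Osko payment
    },
}

def extended_detail(txn):
    """Dispatch on the discriminator to find scheme-specific detail, if any."""
    if txn.get("transaction$type", "").startswith("npp-"):
        return txn.get("npp")
    return None  # non-NPP schemes would get their own branches as they are added

print(extended_detail(npp_x2p1_txn)["service"])
```

The advantage of the discriminator is that a consumer can tell from the collection payload alone which overlay (if any) a transaction carries, without a per-transaction detail call.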
In terms of feedback on the detail modelled, for NPP payments from an account:
- from - this would be an accountId.
- to - this can be a PayID, a BSB/Account Number, or a PAN.
- extendedDescription - 280-character Unicode description.
- service - for example, "x2p1" is an Osko payment.
For incoming NPP payments into an account:
- from - this would be a BSB/Account Number or a proprietary scheme. This could be an accountId for an internally held account, but not for an externally held account.
- to - this would be an accountId.
- extendedDescription - 280-character Unicode description.
- service - for example, "x2p1" is an Osko payment.
For bulk transactions that are filtered and paged the payloads interleave transactions from multiple accounts. Is this the best way to represent this data?
We don't see any alternative way - if the transaction data was grouped under account objects, then paging would be broken, and it may as well be using the GET /banking/accounts/{accountId}/transactions endpoint at that point.
We think your concern is symptomatic of a different issue. The bulk methods (GET and POST) have limited utility, and add implementation complexity for data providers and data consumers. The bulk methods reduce round-trips, but since most consumers have very few active accounts, these methods are advantageous only for the tiny minority of consumers that have a large number of accounts. If a consumer has 3 accounts with 100 transactions in each, it is broadly the same whether they are retrieved 25 at a time with 12 calls to a bulk endpoint, or with 4 calls each to 3 account endpoints.
We recommend that the bulk methods be made optional for data providers as they are in the UK specifications.
We would prefer to implement GET /banking/accounts/{accountId}/transactions, along with ETag or Last-Modified HTTP headers, so that data consumers can efficiently determine whether there are new transactions available for a specific account.
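A sketch of how a data consumer might use such headers, assuming the provider returned an ETag with each transaction-list response; the header bookkeeping below is a client-side illustration, not part of the proposed standard.

```python
# Cache of the last ETag seen per account, keyed by accountId.
etag_cache = {}

def request_headers(account_id):
    """Headers for GET /banking/accounts/{accountId}/transactions."""
    headers = {"Accept": "application/json"}
    if account_id in etag_cache:
        # Ask the provider to answer 304 if nothing has changed.
        headers["If-None-Match"] = etag_cache[account_id]
    return headers

def handle_response(account_id, status, etag):
    """304 means nothing changed since the cached ETag; skip re-processing."""
    if status == 304:
        return "unchanged"
    etag_cache[account_id] = etag
    return "refresh"

# First poll: no cached ETag, so no If-None-Match header is sent.
assert "If-None-Match" not in request_headers("acc-1")
handle_response("acc-1", 200, '"v42"')
# Second poll: the cached ETag is sent; a 304 avoids refetching transactions.
assert request_headers("acc-1")["If-None-Match"] == '"v42"'
assert handle_response("acc-1", 304, None) == "unchanged"
```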
Is this the extent of transaction fields to include? What additional metadata or fields would be valuable and viable to include?
Creditor/Debtor account details
We would expect the creditor account details to be provided in the case of a credit transaction, and the debtor account details to be provided in the case of a debit transaction.
The UK specs include this information.
Without it we can't distinguish a transfer between the consumer's own accounts from a transfer between the consumer and an external party.
@da-banking the question of transaction IDs is a difficult one. The lack of mandatory transaction IDs introduces unnecessary challenges for Data Consumers. Without immutable and unique transaction IDs, Data Consumers are forced to request the maximum number of a user's transactions in order to guarantee the data they are using is correct; alternatively, the Data Consumer has to create matching logic which potentially does not match up with the bank's matching logic, and this could create inaccurate data and serious user detriment.
A compromise position might be for a bank, as a micro-service at the API layer, to hash a series of fields which (combined) present a sufficient level of immutability.
@jh-a - we are comfortable including a hash if it is helpful. However, a hash can also be computed by the data consumer as well as the data provider. If the data provider computes it then the payload transferred between the data provider and the data consumer is increased. Leaving this to the discretion of data consumers would allow them to generate a hash only if they got value from it, and also make an appropriate trade-off between key-size and collision probability for their use case. So we would expect a hash to be added to data provider scope only if the hash needs to be computed from fields that are not in the payload.
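For illustration, a consumer-side hash along these lines might look like the following; the chosen fields and the use of SHA-256 are assumptions that a real implementer would tune for their own immutability, key-size, and collision trade-offs.

```python
import hashlib
import json

def transaction_hash(txn, fields=("postDateTime", "amount", "description", "reference")):
    """Derive a stable pseudo-identifier by hashing fields assumed not to mutate.

    The field list is illustrative; each implementer would choose fields that
    are sufficiently immutable in their own ledger.
    """
    # Canonicalise with sorted keys so the same fields always hash identically.
    canonical = json.dumps({f: txn.get(f) for f in fields}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

txn = {"postDateTime": "2018-10-01T09:30:00Z", "amount": "-42.50",
       "description": "COFFEE CART SYDNEY", "reference": "00017653"}
h1 = transaction_hash(txn)
h2 = transaction_hash(dict(txn))  # a copy with identical fields hashes identically
assert h1 == h2 and len(h1) == 64
```

The caveat raised above applies: two transactions with identical values in every hashed field would collide, which is why the choice of fields is the hard part.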
The scope of Open Banking is for banks to make data available to consumers that the banks currently hold in a digital form. Adding immutable transaction ids in all cases is going beyond that, will be expensive to implement, and add performance overheads to operate.
We think the intention of the /banking/accounts/{accountId}/transactions/{transactionId} endpoint is to facilitate other future transaction detail (besides NPP) being included at that endpoint, and in that context the endpoint makes some sense. This is why we have proposed a conditional transactionId as a best attempt to support this.
We process millions of transactions daily. If each data consumer is retrieving transaction detail with a round-trip per transaction then this would generate significant overhead. These systems are designed to retrieve and display transaction history at the account or customer level - in line with the normal UX of a mobile/internet banking app, or a paper statement.
Our strong preference would be to eliminate the transactionId property altogether and drop the /banking/accounts/{accountId}/transactions/{transactionId} endpoint. Instead we would prefer to return the NPP or other detail in the transaction collection responses. We would also be open to the inclusion of an optional query string parameter include-detail=true|false on transaction collection endpoints, so that the data consumer could specify whether the detail should be returned.
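If such a flag were adopted, a consumer request might be built as below; the parameter name include-detail is the one suggested above, and the helper function is hypothetical.

```python
from urllib.parse import urlencode

def transactions_url(account_id, include_detail=False):
    """Build the collection URL with the suggested (hypothetical) detail flag."""
    base = f"/banking/accounts/{account_id}/transactions"
    query = {"include-detail": "true" if include_detail else "false"}
    return base + "?" + urlencode(query)

print(transactions_url("acc-1", include_detail=True))
```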
The UK specs only have transaction collection endpoints. They do not define a resource at the granularity of a single transaction.
@da-banking thanks for the feedback. As to your points:
I don't think it particularly matters where the hash is generated, provided the data provider can identify fields which, when hashed, demonstrate sufficient immutability for identifying a transaction singularly - that is the challenge.
I don't agree with your point about equivalence between what is delivered to customers and open banking. An API is a machine interface, and thus must include characteristics which facilitate a machine interaction. Customers do not currently receive JSON payloads, a range of headers in each message, consent IDs - the list goes on, but all of these features are necessities for facilitating machine interactions.
As to the performance overhead, my suggestion is predicated on the assumption that the ID (created by a hash of sufficiently immutable fields) would only be generated at the API layer by a micro-service, when the transactions were called. Hence any overhead would be entirely dependent on the usage of any transactions endpoint.
I broadly agree that a single resource to return transaction IDs is probably superfluous, and that returning transactions as a collection with an ID in the body of each instance is a preferable and usable approach.
As to the UK specs (of which I am a joint author) the final version for adoption is at https://openbanking.atlassian.net/wiki/spaces/DZ/pages/641795939/Transactions+v3.0. The UK is currently considering approaches to the adoption of a unique and immutable transaction ID for posted transactions, following broad recognition of the challenge that transaction mutation presents to data consumers and customers alike. You can review the decision and supporting information at https://openbanking.atlassian.net/wiki/spaces/WOR/pages/720830465/CR-021+Read+APIs+Including+unique+and+immutable+transaction+ID+to+transactions
@jh-a we are somewhat at cross purposes.
We were not suggesting there was an equivalence between what is delivered to customers and open banking. Just that where there is a deviation, we're heading into broader architectural changes, trade-offs, and incurring higher costs. As a rule of thumb, endpoints/payloads that align with a current user interaction will be easier for banks to support than novel ones, and banks are more likely to have the data proposed.
We don't have a unique immutable ID for each transaction, adding one is do-able, but non-trivial. We understand how useful this would be. We could use it too :-)
We don't have an analogous process for retrieving individual transaction records by a unique immutable ID. Adding this introduces performance trade-offs to other important existing processes that banks perform, so we're not keen to implement that without a good reason, and we've suggested an alternative that side-steps this concern.
So to summarise, we can have a transactionId that is unique and immutable and fulfils the role of the hash you suggested for merging, but we do not want to support GET /banking/accounts/{accountId}/transactions/{transactionId} and would prefer this detail to be included on the other transaction history collection endpoints.
@da-banking great! we're on the same page!
S/RC/RD (181026-1)
We note that the description, reference and a number of the NPP fields may contain sensitive information, including for example medical information, place of employment, or contact details. These details may be those of a party who did not give consent but was entered by the consenting customer. Careful consideration needs to be given to security scopes and the consent process.
In regards to bulk transactions, there is unanimous objection among the Financial Institutions to the implementation of a bulk account and transaction endpoint. The processing impact of aggregating the collective portfolio of a customer's accounts and their subsequent transactions has the potential to cause large performance issues. Technically sourcing, composing and returning a payload of this size will far exceed the potential response requirements and greatly degrade the customer experience. Further, the presence of a 'bulk' method moves the onus onto the Financial Institution to perform reporting and Personal Financial Management functions in lieu of the Third Parties, who can create and manage these within their own customer base. The view of the ABA Working Group is for the Bulk Endpoint to be removed from the proposal and the July 1st 2019 delivery.
The ABA Open Banking Technical Working Group notes and supports Westpac's comments above re runningBalance.
NAB is broadly supportive of the intent of this proposal, and the designs and structures are well considered. However, implementing all the proposed end points as they are described will be a significant challenge. To compensate, the following amendments are proposed:
Privacy
From a Security and Data Privacy perspective, a transaction's reference and extendedDescription attributes will most likely contain personal information about customers (including potentially 'sensitive information' as defined in the Privacy Act). Sensitive information is defined to include data such as health data, and NAB is concerned that the reference / extendedDescription of transactions could include this sensitive health-related data (for instance, these fields may reveal a consumer's visits to their doctor). Under the Australian Privacy Principles (APPs), where APP entities handle 'sensitive information' the APPs impose more stringent obligations, and as such this data should not fall within scope of a yet-to-be-tested data sharing eco-system.
Additionally, real-time payments with extended remittance information are now possible, allowing people to send payments to their friends/contacts at any time, in a more conversational, in-the-moment way. We believe that sharing this information is not appropriate and may create more friction in getting widespread user trust and adoption within the scheme, as people may be less inclined to share any data knowing that this sensitive data could be included even for the simplest of use cases.
Given the risks associated with sharing this data we explicitly object to the inclusion of reference and extendedDescription attributes.
Data
End Points
For the reasons stated above and throughout this thread, transactionId is currently NOT available for all transactions, and as such we recommend de-scoping the following end point entirely, in favour of the compromise alternative discussed above.
/banking/accounts/{accountId}/transactions/{transactionId}
Retrieving transaction history across multiple accounts, i.e. "GET /banking/accounts/transactions/", also presents challenges, especially when considering the following statement:
account-category - Used to filter results on the accountCategory field. Any one of the valid values for this field can be supplied. If absent then all accounts returned
In this scenario, the transactions of a corporate customer with 100+ accounts and thousands of transactions per account would still need to be returned; coupled with the other query parameters, this end point can easily become exceedingly complicated. Notwithstanding, this is already a challenge for retail customers with only a few accounts.
We believe the "POST /banking/accounts/transactions/" end point was intended to address the problem articulated above, in so far as it could limit the number of accounts that could be requested; however, we believe that it too is still too complicated to be included within the phase 1 scope. As an industry we may be better off implementing fewer end points well, learning from and refining them before attempting the advanced use cases.
There is still too much uncertainty surrounding how scopes and entitlements would determine which accounts a user may or may not see with respect to these aggregated end points.
In summary we support:
- GET /banking/accounts/{accountId}/transactions (with stated modifications)
- GET /banking/accounts/transactions (optional for data providers)
We recommend de-scoping:
- GET /banking/accounts/{accountId}/transactions/{transactionId}
- POST /banking/accounts/transactions (delay, and leverage learnings from the associated GET end point)
Filters
Filtering is too complex; we recommend a simple date-range filter set only, per account. Data providers can optionally support the other proposed query parameters, i.e. min-amount, max-amount and text. TPPs will likely attempt to aggregate data from multiple financial institutions in any case; when their customers search their transaction history, they will probably want to search a specific text field across accounts from multiple banks. This means TPPs would have to build much of this functionality themselves anyway, making these additional complicated search capabilities redundant from a data provider point of view.
The query types posited in the API, including sorting, amount ranges, date ranges and more general description search, imply the establishment of 'analytical services' over transaction data. Such value-add services may be provided by a TPP, and indeed we expect parties to create such services. In contrast, we understand the role of the data provider to be to establish the 'data services' which supply the data so that those TPPs can build their value-add propositions.
To that end, and to expedite the supply of the data, we recommend:
API design
CommBank feels it may be inefficient to require a call to the /banking/accounts/{accountId}/transactions/{transactionId} endpoint to fetch additional details of a transaction. A suggested alternative would be for the /banking/accounts/{accountId}/transactions endpoint to return both basic and additional details of transactions, with an input flag indicating whether more details are required. The additional details would be optional fields, and should be provided by the data provider if they exist.
Pagination
The usage of page and page-size in the request, together with the mandatory totalRecords, totalPages, and links in the response, might not be ideal because:
- It implicitly assumes that the transaction list does not change while it is being queried. If new transactions arrive between queries, they shift existing transactions onto later pages, which will then be returned again when users query the next page.
- It forces a full transaction prefetch and count before returning any records. This is not efficient, especially when dealing with potentially up to 7 years of history.
An alternative approach is to leverage a page token or cursor. The request specifies the page-size and an optional next-page-token argument. On the first request, the next-page-token is empty. In the response, if there are more transactions (more pages), a next-page-token is returned. The client includes this next-page-token in the subsequent request to get the next results, and keeps doing so until an empty next-page-token is returned. This ensures the result contains the full transaction list, excluding any new transactions that happened after the first query was executed.
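A minimal sketch of such token paging, using a keyset cursor so that newly arriving transactions cannot shift later pages; the token format, field names, and in-memory ledger are invented for illustration.

```python
import base64
import json

# Toy ledger, newest first (postDateTime DESC), keyed by a descending id.
LEDGER = [{"id": i, "postDateTime": f"2018-10-{i:02d}T00:00:00Z"}
          for i in range(30, 0, -1)]

def _encode(last_id):
    """Opaque token pinning the cursor to the last row already served."""
    return base64.urlsafe_b64encode(json.dumps({"after": last_id}).encode()).decode()

def _decode(token):
    return json.loads(base64.urlsafe_b64decode(token))["after"]

def get_transactions(page_size, next_page_token=None):
    """Keyset paging: rows inserted at the head cannot shift later pages,
    because the token identifies the last row served, not an offset."""
    rows = LEDGER if next_page_token is None else [
        t for t in LEDGER if t["id"] < _decode(next_page_token)]
    page = rows[:page_size]
    # An empty next-page-token signals the final page.
    token = _encode(page[-1]["id"]) if len(rows) > page_size else ""
    return {"transactions": page, "next-page-token": token}

# Client loop: keep requesting until the token comes back empty.
seen, token = [], None
while True:
    resp = get_transactions(12, token)
    seen.extend(resp["transactions"])
    token = resp["next-page-token"] or None
    if token is None:
        break
assert [t["id"] for t in seen] == list(range(30, 0, -1))
```

An offset-based token would not fix the shifting problem; pinning the cursor to the last-served key (as above) is what makes the page boundaries stable.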
Bulk Transactions
The proposed ability to serve an aggregated/bulk view of transactions (/banking/accounts/transactions) to an end consumer may prove computationally expensive and as a result technically difficult to implement. Commercial entities may have hundreds of different accounts, and there is no mechanism in downstream systems to provide a sorted, combined view for a specified customer. This data would likely need to be materialised just prior to the API gateway, introducing latency and potentially data quality issues. CommBank would suggest this request be removed, and the potential use cases solved by aggregation on the client/TPP side.
Data Fields
We are aligned with the view shared by @WestpacOpenBanking in regards to the runningBalance field. We note the earlier justification for including a running balance based on its presence on a statement document. CommBank does not recommend using statements as a benchmark for which fields are included in the payload, because statements are manufactured as a batch process over a given time period and are a factual representation of a balanced journal.
CommBank also suggests there should be a structured way of indicating whether a transaction is a debit or credit on the account. Currently a decimal value is not determinative, and as such an additional property may need to be included.
Additionally we share similar views to @NationalAustraliaBank and @WestpacOpenBanking in regards to the potential data leakage of PII through the transaction reference field, and would caution that safeguards are put in place to ensure that consumer privacy protections can be enforced.
With respect to the references to “Reference/String/Optional”, which refers to a bank reference, we suggest that this be “conditional” or “mandatory”, as each originating bank transaction would have a reference.
I'm about to close the feedback period. Comments on feedback to date that will steer the final decision (for the draft standards) are below:
Transaction ID Feedback around making this conditional is understood. This change will be added. As the Transaction ID is useful for resolving the issue of transactions altering during paging, and for general de-duping so transaction calls do not need to be repeated, it should not be used to indicate whether more info is available. I will add an optional "isDetailAvailable" flag for this purpose.
Filter Dates Good point regarding timezones and which date is specified for filtering. I will align query parameter dates to match DateTimeString format. This will make it even more specific and will also allow for timezone specification as that is part of that format. I will also fix the “after this date” language for end-date. That was a copy/paste error.
Ordering Agree to specifying date in descending order. I will update accordingly.
Running Balance The argument that this field is of minimal use with an API that can be filtered and paged is valid. I will remove it for now.
CR/DR The suggested language "a negative value indicates a reduction of the available balance on this account" will be included. Before adding a CR/DR field I would like to understand specifics around situations where a positive or negative sign on the amount field is not sufficient.
Account Category The use of account category on the filter for bulk APIs is deliberate as it matches the account list filtering. The idea being that I select a list of accounts and then bulk retrieve the balances or transactions for that filter. The actual category field is not included in the payload as it can be retrieved via the accountId and presumably is already (or can be) cached by the client by calling the account list API. The purpose of the payload is for transaction data, not account data.
Mandatory Reference OK. the reference field will be made mandatory.
Positive Integer OK. I'll change the PositiveInteger common type to exclude zero and add NaturalNumber as a common type. This will result in type changes across a number of API end points.
Pagination Page immutability was called out in the pagination proposal as not being required. This is at the discretion of the client to manage, acknowledging that it will be a pain. The reality that transaction lists are changeable is simply a fact of life for this data set.
Bulk APIs The feedback that the bulk APIs are of limited use is, itself, of limited use. The feedback here is primarily provided by organisations aligned with the banks. Feedback on whether these APIs are of use is more appropriately provided by the data consumer community. I think the feedback also fails to acknowledge that, if these APIs are not supported and the assumption of limited use fails, data consumers will end up calling the account-specific end points more often than required. That leaves performance as a key concern which, while entirely valid, is better managed via NFRs rather than by making the end points optional. Making end points optional will simply mean that they are unlikely to be implemented widely, meaning data consumers are unlikely to utilise them in client implementations. Until there is more broad feedback from non-bank stakeholders, these end points will remain in scope and mandatory.
Transaction Detail There is conflicting feedback here. On the one hand there is concern around performance and the sensitivity of the additional data available via transaction detail (which is mainly NPP related). On the other hand there is a suggestion to fold this API into the transaction list end point.
The purpose for keeping a separate end point was threefold:
As such, for the draft proposal, the transaction detail end point will remain.
NPP Specifics NPP specific notes:
Thanks all.
-JB-
The finalised decision for this topic has been endorsed. Please refer to the attached document. Decision 028 - Transaction Payloads.pdf
-JB-
This decision proposal outlines a recommendation for the payloads for transactions as per the end points defined in decision proposal 015.
Feedback is now open for this proposal. Feedback is planned to be closed on the 26th October. Decision Proposal 028 - Transaction Payloads.pdf
Please note the specific concerns. If there is significant early feedback I will reissue with amendments prior to the closure date.