ConsumerDataStandardsAustralia / standards-maintenance

This repository houses the interactions, consultations and work management to support the maintenance of baselined components of the Consumer Data Right API Standards and Information Security profile.

Running balance available under transaction detail #553

Open jimbasiq opened 1 year ago

jimbasiq commented 1 year ago

Description

The Balance for an account is available via https://consumerdatastandardsaustralia.github.io/standards/#get-account-balance

This change request is for a running balance attribute to be added to the transaction object.

This is needed, for example:

  1. If a data recipient wants to see a point in time balance
  2. If a data recipient wants to see an average balance over a defined period
  3. If a data recipient wants to predict a balance flow over time

Area Affected

Data returned by a GET /banking/accounts/{accountId}/transactions request response.

Change Proposed

The change would be the inclusion of a runningBalance of type BankingBalance on the ResponseBankingTransactionList.

perlboy commented 1 year ago

This isn't actually data that is stored in many source systems, particularly in the location that stores individual transactions. Instead it is often computed on presentation, typically from a daily checkpoint of the current balance, which in the context of list transactions is problematic because the endpoint is not bounded to a day. This request therefore seems to be asking for what is, in essence, a materialised view to be created and would effectively force holders to implement a data transform/store layer they may not be able to afford.

Further notes:

This proposal seems to be adding derived data for specific use cases (ie. it's degeneralising the data) that seem like they are better solved and more appropriate within the Recipient space.

jimbasiq commented 1 year ago

The assumption here is all Banks are capable of doing this as they currently do it on their online banking and their mobile apps. I would be surprised if this is being done in the front end systems.

To clarify the use cases:

  1. If a data recipient wants to see a point in time balance without having to call the get balance api multiple times or once that point in time has passed
  2. If a data recipient wants to calculate an average balance over a defined period
  3. If a data recipient wants to predict a balance flow over time e.g. what is the expected (average) balance on the 25th of each month

ghost commented 1 year ago

Hey @jimbasiq! In our case it is true that the running balances you see on statements or in Internet banking are calculated by the CBS as required. Others may be different, but that's how ours works :-).

What is the granularity required for your use-cases? Would the ability to get end-of-day balances for specified dates be sufficient? (E.g. by specifying date windows when calling the get balance endpoint). Again, this data is not stored simply like this and would need to be calculated/generated, but may be more efficiently processed. Maybe... ;-)

rob-hale commented 1 year ago

You beat me to it @mattp-rab - nicely done 👍 I get why you'd assume what you did though @jimbasiq - it seems completely logical. However, Internet Banking and mobile bank apps only need to display a page of transactions at a time so calculating the running balance on the fly in the FE is not such a big overhead. It also gracefully accommodates reversed or retro-fixed transactions with ease.

Probably worth bearing in mind too that the data source for CDR data will not necessarily be the same source as that used to serve mobile and IB apps. Banks have had to get a bit creative for CDR - some use an ODS to improve performance, so even if it were available for a mobile app, it might not be a trivial exercise to include a new data element for CDR.

A final little nugget is that there are actually two balances for many accounts - the current balance and the available balance - I'm sure that will surface at some point too if it hasn't already...

I think I get your use case though - for example, finding the best (most reliable) day of the month to make a payment based on a profile of the average balance of an account - it's a useful one to improve payment success / confidence. Would love to know if other banking DHs actually store the balance against each transaction... anyone?

jimbasiq commented 1 year ago

Thanks both. It pains me to see a situation where a job is being done in multiple places (with possible differing results) when it could be done once at source. Just because this has been the pattern to date, should we persist it? ¯\_(ツ)_/¯

We do have some challenges if we were to calculate on our side. e.g. Some use cases require historical balances, for example "what was the balance prior to the salary payment 1 year ago". We can Get Transactions For Account with a date range to find the salary transaction. However, to find the balance at that time we will need to get the current balance then make multiple calls to Get Transactions For Account to get all the transactions from now to 1 year ago. Rather than making 1 service call we are making multiple, pushing up already strained TPS limits.

The other way to solve this problem is to put a date/time on the Get Balances For Specific Accounts.

ghost commented 1 year ago

@jimbasiq consistency and source of truth are really good reasons why this should be done by the DH if it is to be done anywhere. I like it.

The other way to solve this problem is to put a date/time on the Get Balances For Specific Accounts.

👆 This was my question above - would this approach meet your use-case requirements??? It is less granular and therefore potentially easier/more efficient to calculate/generate.

rob-hale commented 1 year ago

This is making sense @jimbasiq and @mattp-rab... I was originally wondering if the ADR could just collect account balances each day and store them, knowing the timestamp of that action, but that creates problems because of data minimisation and also if it's a new authorisation, there is no history of what the balance was a year ago until we've been doing this for a year. So I think your suggestion of being able to request a balance for a specific historical datetime could be a useful way of meeting this need. Not sure how hard it will be for all DHs to implement, but it feels easier than stamping every transaction with a balance and all the complexity that may bring. Let's face it, if this data is needed to support a popular use case, the market will use whatever means are currently necessary and as you point out, today that means more TPS which isn't ideal.

WestpacOpenBanking commented 1 year ago

Westpac does not support this change as the raw data required for the calculation of running balance is already available to ADRs via existing APIs.

Westpac requests that DSB considers the economic benefit vs costs to the ecosystem when receiving requests for additional data to be provided from Data Holders. Data that can be derived may benefit specific use-cases, but can also be calculated or solutioned by ADRs developing those use cases. Expanding mandatory data fields to be provided by Data Holders for data easily derived has high costs of implementation across the ecosystem.

NationalAustraliaBank commented 1 year ago

We echo Westpac's point of view on this item and thus do not support this change.

Further, we also support Westpac's opinion on weighing up economic benefits vs cost to the ecosystem when new CRs are being raised.

DougFromPayPal commented 11 months ago

PayPal is not supportive of this change and is aligned with the concerns posted by Westpac and National Australia Bank. In addition, PayPal is a Purchased Payment/Stored Value Facility, that uses a customer’s digital wallet to facilitate payment transactions via online and brick/mortar business entities. Although PayPal account holders have the ability to ‘store value’ in their digital wallet account, the vast majority do not and rely on linked funding instruments (credit card, debit card, bank account) to fund each payment transaction. Even with a stored balance available, a PayPal account holder can choose to fund a transaction in full with a linked account, in full using their PayPal Digital Wallet balance, or partially fund it using both the available balance and a linked account. A PayPal account balance is unlike a typical bank account and the concept of a running balance is not applicable or relevant in our business model. Keeping and/or calculating a running balance, on demand, is not a function currently supported in PayPal. PayPal agrees with other comments that this warrants further investigation in weighing up economic benefits vs cost to the ecosystem.

markskript commented 11 months ago

I can appreciate the issues brought up by the ADHs here - but I also want to call out that it's essentially impossible for the ADRs to generate a running balance due to the inability to track payments across the PENDING and POSTED barrier, and also the fact that a few large ADHs seem to randomize the transaction ID every time a transaction is updated, leading to duplicates which we cannot resolve.

perlboy commented 11 months ago

Responding specifically to some of the Recipient comments here and it's a bit chronological cause I dropped from this convo.

The assumption here is all Banks are capable of doing this as they currently do it on their online banking and their mobile apps. I would be surprised if this is being done in the front end systems.

No. They aren't. In IB they are providing a balance according to the ledger not according to the actual balance that could be drawn down on an account. For instance, credit card holds, unreconciled transactions, reversals, payment clearings etc. all of those aren't on the ledger and many (most?) organisations either show them as standalone "Pending" without running balances or in the transaction list with balances redacted.

A ledger balance is not the same as an available balance.

Thanks both. It pains me to see a situation where a job is being done in multiple places (with possible differing results) when it could be done once at source. Just because this has been the pattern to date, should we persist it ¯\_(ツ)_/¯

The CDR is about sharing data organisations already have. It might "pain" you but the alternative is effectively Recipients outsourcing their problems, at zero cost, back to Holders.

  1. If a data recipient wants to see a point in time balance without having to call the get balance api multiple times or once that point in time has passed

There's no reason to not call the Get Balance API multiple times. If the desire is to be able to call it more often then that's a discussion to be had around NFRs etc.

  2. If a data recipient wants to calculate an average balance over a defined period
  3. If a data recipient wants to predict a balance flow over time e.g. what is the expected (average) balance on the 25th of each month

These both seem like high school mathematics problems especially since the ADR has access to 2 years worth of data.
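As an illustration of the arithmetic involved, here is a minimal Python sketch that rebuilds end-of-day balances from a known closing balance and a list of dated, signed amounts, then averages them. The function names and the `(date, amount)` input shape are hypothetical, and the sketch assumes the transaction list is complete and that negative amounts are debits.

```python
from datetime import date, timedelta
from decimal import Decimal


def daily_closing_balances(closing_balance, end_day, start_day, txns):
    """Return {day: end-of-day balance} for start_day..end_day inclusive.

    closing_balance is the balance at the end of end_day; txns is a list of
    (posting_date, signed_amount) pairs (assumed complete for the period).
    """
    net_by_day = {}
    for day, amount in txns:
        net_by_day[day] = net_by_day.get(day, Decimal("0")) + Decimal(amount)

    balances = {}
    balance = Decimal(closing_balance)
    day = end_day
    while day >= start_day:
        balances[day] = balance
        # Undo this day's net movement to get the previous day's close.
        balance -= net_by_day.get(day, Decimal("0"))
        day -= timedelta(days=1)
    return balances


def average_balance(balances):
    """Simple mean of the end-of-day balances."""
    return sum(balances.values(), Decimal("0")) / len(balances)


txns = [(date(2024, 1, 4), "-20"), (date(2024, 1, 2), "50")]
balances = daily_closing_balances("100", date(2024, 1, 4), date(2024, 1, 1), txns)
print(average_balance(balances))  # (100 + 120 + 120 + 70) / 4 = 102.5
```

The same per-day map can be filtered to a given day of the month (for example the 25th) to estimate an expected balance on that day.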

I note that ADRs seem to be deliberately retrieving the full 2 years regardless of use case so arguably this is a problem they can already solve from data they have already downloaded. It remains to be seen if this meets the data minimisation privacy safeguards (2 years of history seems appropriate for a house loan not a pay advance or <$2500 loan).

We do have some challenges if we were to calculate on our side. e.g. Some use cases require historical balances, for example "what was the balance prior to the salary payment 1 year ago".

Play the transactions back in reverse.
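A minimal sketch of that reverse playback, in Python. It assumes the starting figure is the current (ledger) balance, that the list is complete and sorted newest first, that only POSTED items affect the ledger, and that amounts are signed with negative values for outgoing money. The field names `transactionId`, `status` and `amount` are used as they are understood to appear on the transaction object and should be checked against the Standards; the function name is hypothetical.

```python
from decimal import Decimal


def replay_in_reverse(current_balance, transactions):
    """Walk transactions newest-first, recording the balance around each one."""
    balance = Decimal(current_balance)
    history = []
    for txn in transactions:
        if txn["status"] != "POSTED":
            continue  # assumption: pending items are not in the ledger balance
        amount = Decimal(txn["amount"])
        history.append({
            "transactionId": txn["transactionId"],
            "balanceAfter": balance,
            "balanceBefore": balance - amount,  # undo the transaction
        })
        balance -= amount
    return history


transactions = [
    {"transactionId": "t4", "status": "PENDING", "amount": "-15.00"},
    {"transactionId": "t3", "status": "POSTED", "amount": "-50.00"},
    {"transactionId": "t2", "status": "POSTED", "amount": "1000.00"},  # salary
    {"transactionId": "t1", "status": "POSTED", "amount": "-25.00"},
]
history = replay_in_reverse("1450.00", transactions)
before_salary = next(
    h["balanceBefore"] for h in history if h["transactionId"] == "t2"
)
print(before_salary)  # 500.00
```

The result is only as reliable as the inputs: a missing, duplicated or mis-statused transaction shifts every balance computed before it.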

We can Get Transactions For Account with a date range to find the salary transaction. However, to find the balance at that time we will need to get the current balance then make multiple calls to Get Transactions For Account to get all the transactions from now to 1 year ago.

That is correct and you can do it 1000 transactions at a time. For what it's worth our observation is that a typical Consumer rarely has more than a few thousand transactions in a year. There are grounds here to instead consider an async approach (i.e. "dump all of it and let me know when it's ready") as an alternative and that's something CDR+ has some initial conversations going on around.
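A rough Python sketch of paging through the list 1000 at a time. The HTTP call is deliberately left out: `fetch_page` is a hypothetical callable supplied by the caller, and the response shape (`data.transactions`, `meta.totalPages`) is an assumption about the paginated response that should be checked against the Standards.

```python
def iter_transactions(fetch_page, page_size=1000):
    """Yield every transaction, requesting one page at a time.

    fetch_page(page, page_size) is assumed to return a dict shaped like
    {"data": {"transactions": [...]}, "meta": {"totalPages": n}}.
    """
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body["data"]["transactions"]
        if page >= body["meta"]["totalPages"]:
            break
        page += 1


# Stand-in for a real client: 2500 fake transactions served in pages.
FAKE = [{"transactionId": f"t{i}"} for i in range(2500)]
calls = []


def fake_fetch_page(page, page_size):
    calls.append(page)
    start = (page - 1) * page_size
    total_pages = -(-len(FAKE) // page_size)  # ceiling division
    return {
        "data": {"transactions": FAKE[start:start + page_size]},
        "meta": {"totalPages": total_pages},
    }


all_txns = list(iter_transactions(fake_fetch_page))
print(len(all_txns), calls)  # 2500 [1, 2, 3]
```

At this page size, a consumer with a few thousand transactions in a year needs only a handful of calls per account.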

Rather than making 1 service call we are making multiple, pushing up already strained TPS limits.

I don't think introducing functionality through attributes to solve for throughput challenges is a solution, all this is doing is moving the problem further into the banking architecture. It might work in some cases but what is also almost certain is velocity of change will dramatically slow down. This is, in essence, the whole reason for existence of many Fintech organisations (i.e. high velocity).

The other way to solve this problem is to put a date/time on the Get Balances For Specific Accounts.

If this is a suggestion to request a balance at a point in time it is, again, moving the burden into the stack not solving for it.

I can appreciate the issues brought up by the ADHs here - but I also want to call out that it's essentially impossible for the ADRs to generate a running balance due to the inability to track payments across the PENDING and POSTED barrier,

This seems to imply Holders actually can do this without fundamental changes at very high cost to numerous backend systems (tl;dr: They can't). The CDR isn't meant to solve all the underlying challenges in banking. There is a statement to be made here that Recipients need to "get with the program" of the challenges holders face every day. Opposition to this proposal isn't really a Holder vs. Recipient situation but most likely technologist(s) in the Holder side saying "you don't understand the full problem space".

and also the fact that a few large ADHs seem to randomize the transaction ID every time a transaction is updated, leading to duplicates which we cannot resolve.

So this might be a compliance issue but it probably won't be. Taking the PENDING and POSTED situation, in quite a number of deployments PENDING lives in a completely separate part of a bank's infrastructure (i.e. the part attached to the payment rails) and is then "broadcast" to the downstream ledgers which then reconcile it as POSTED. This means that from a bank's perspective the transaction identifier is different, the first representing the transaction id in the payment processing ledger and the second representing the reconciled version of it in the core system. I'll note here that in the CDR and for NPP payments this seems to be a solved problem with endToEndId so there may be a question around whether it is suitable (I'm not an NPP expert) and whether organisations are supplying it.

The NFRs of the Standards force organisations to concurrently broadcast these to a third system (ie. an operational data store or similar) which itself has no way of deduplicating such things to provide a single transaction identifier. In internet banking situations (ie. "other digital channels") this is completely fine because the user typically sees "Pending" transactions often without identifiers and this can be pulled from the payment rails ledger (that gets flushed after reconciliation).

joshuanicholson commented 11 months ago

We understand the request for a running account balance and would use it should it be made available. However, we also appreciate the feedback and issues raised by the Data Holders, as a balance is a calculated value rather than a stored value.

We feel this is fundamentally a data quality and compliance issue, as we have data integrity issues with balance & transaction data collected from DHs. This means calculating the balance backwards from the current balance is proving problematic and essentially impossible for some DHs.

A rather simple example can be demonstrated as follows (appreciate a couple of assumptions in the following example, but as we know, without clarity of the specification and consistent delivery of data, sometimes assumptions must be made).

Let's say two API calls are made within milliseconds of each other; a get balance and get transactions.

  1. The Balance call returns an available balance of $80 and a current ‘ledger’ balance of $100
  2. The Transaction call returns many transactions for the required period (28 days) and includes two ‘pending’ transactions that have a net value of $35
     a. One pending transaction is $20 to Big Telco
     b. Second pending transaction is $15 to Best Friend

So given the above pending transactions and a $20 difference between the available & current balance, how can this mismatch be explained? (Ignoring things like credit card & overdraft limits; this is just an everyday transaction account.) Based on this simple example, a human reviewing the data would suggest that there is in fact only a single pending transaction, as the $15 transaction is already posted and included in the current balance (suggesting the DH’s API for balance & transaction are not synchronised). Alternatively, the available balance should be $65. This also means any attempt to reverse engineer a running balance will yield incorrect information.
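The human review described in this example can be sketched in Python as a brute-force search for the subsets of pending transactions that account for the gap between current and available balance. The function name and input shape are hypothetical, and the sketch assumes an everyday transaction account with no credit or overdraft limit, with pending amounts given as positive holds.

```python
from decimal import Decimal
from itertools import combinations


def explain_balance_gap(current, available, pending):
    """Return (gap, subsets of pending payees whose amounts sum to the gap).

    pending is a list of (payee, amount) pairs. The search is exponential
    in len(pending), so it only suits short pending lists.
    """
    gap = Decimal(current) - Decimal(available)
    explanations = []
    for size in range(1, len(pending) + 1):
        for combo in combinations(pending, size):
            total = sum((Decimal(amount) for _, amount in combo), Decimal("0"))
            if total == gap:
                explanations.append([payee for payee, _ in combo])
    return gap, explanations


pending = [("Big Telco", "20"), ("Best Friend", "15")]
gap, explanations = explain_balance_gap("100", "80", pending)

# If both pending items were genuinely outstanding, available would be:
expected_available = Decimal("100") - sum(
    (Decimal(amount) for _, amount in pending), Decimal("0")
)

print(gap, explanations, expected_available)  # 20 [['Big Telco']] 65
```

An empty result, or more than one matching subset, means the gap cannot be attributed unambiguously from the data returned.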

Sure, the above could be done programmatically, but we see examples where it is impossible to identify transactions that match the difference between the available and current balance. To complicate matters, should this be a credit card with no limit provided, any calculation/reconciliation becomes impossible. This lack of data quality and noncompliance is forcing us to seek more data points to assist us in reconstructing the ‘ledger’ of a consumer's bank account.

Is the answer compliance enforcement? We say yes. Should there be a change to provide more data? As an ADR, we'd never say no to more data if it ensured data integrity.