ConsumerDataStandardsAustralia / standards

Work space for data standards development in Australia under the Consumer Data Right regime

Decision Proposal 022 - Paging #22

Closed JamesMBligh closed 5 years ago

JamesMBligh commented 5 years ago

This decision proposal outlines a recommendation for paging of APIs that return multiple records.

Feedback is now open for this proposal and is planned to close on the 21st of September. Attachment: Decision Proposal 022 - Paging.pdf

da-banking commented 5 years ago

We can accept the recommendation as is.

We have some concern about the 500 limit as being somewhat arbitrary.

It is less work for a data provider to serve one large request than many small requests. Data providers will want to set throttle limits to ensure stability and availability of the whole service, and then push the data out as fast as possible within those constraints.

If data providers were permitted to return a page size larger than 500 at their discretion, it doesn't seem to impact the standards in any negative way.

The small default of 25 records seems sensible, so that data consumers have to opt-in to making data providers do more work.

bazzat commented 5 years ago

The consensus of the ABA Online Banking Technical Working Group is that we're supportive of the recommendation, with some provisos.

We suggest that the random access provided by the query parameters pg and pgsize is not needed - this is better achieved using filter parameters and possibly a parameter for ascending/descending order - typically a data consumer would want to specify a date range rather than implement a dumb search.

The value of the meta fields and the last link field is also hard to see in the context of use cases where filtering is implemented. These add implementation complexity and latency and do not have our support.

For these reasons we believe that it would be better if the standard specified that ASPSPs SHOULD (instead of MUST) support the query parameters, meta fields and last link field. This may avoid a situation where a data holder needs to decide between meeting non-functional requirements around latency and pagination requirements for July 1. For example, HTTP 501 Not Implemented could be returned if query parameters are not supported.

deboraelkin2 commented 5 years ago

I agree with pagination being included as part of the spec. I'd recommend adding a "prev" field in the links object with similar purpose to "next": URI to previous page. Should be included unless current page is the first.

speedyps commented 5 years ago

I agree with the way pagination is described in the spec.

Even with filter parameters available, a call could request a volume of data which will create unnecessary load for the API provider, and delays on getting responses back for API consumers. Paging is a good way to mitigate a bad call that could create issues for all. Note, the data being provided via the APIs will be for businesses as well as consumers.

For example, suppose I am the owner of a supermarket that processes on average 200000 transactions a day. A call for a single day's transactions without paging may create load for the API provider which they may not be happy with, and could have the API consumers waiting for, say, 3 minutes for data. Timeouts would have to be adjusted to make sure connections are kept open to retrieve the data. If paging is in place, the data would be broken into pieces which allow both sides to have acceptable performance. The API consumers would just have to do slightly more work (ie make 400 calls to get a day's transactions). And what if the API consumer makes a bad choice and calls to get a week's worth of data?
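As a rough check of the arithmetic in this example, the number of calls is the record count divided by the page size, rounded up. The sketch below is illustrative only; the function name `calls_needed` is hypothetical and the figures are the ones quoted in this thread (200000 transactions a day, page sizes of 25, 500 and 1000).

```python
import math


def calls_needed(total_records: int, page_size: int) -> int:
    """Number of paged requests required to retrieve every record."""
    return math.ceil(total_records / page_size)


# One day of supermarket transactions at the page sizes discussed in the thread.
for size in (25, 500, 1000):
    print(size, calls_needed(200_000, size))

# A week of data at the proposed 500-record maximum.
print(calls_needed(7 * 200_000, 500))
```

At 500 records per page this gives the 400 calls mentioned above; a 1000-record page halves that, while the default of 25 would require 8000 calls.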

Paging is a common approach in the UK OpenBanking as well as under DDA in the US.

I have worked with banks in the US where an upper page size of 1000 was used. Giving API consumers the ability to control the size of the pages they want is also a useful feature.

WestpacOpenBanking commented 5 years ago

Westpac is supportive of the proposal in the case that it specifies that Data Holders SHOULD (rather than MUST) support the query parameters (pg, pgsize), meta fields (totalRecords, totalPages) and last link fields. For Westpac, the requirement to include all the specified fields would result in an unacceptable response time to return the first page of results. The date filtering and order parameter approach outlined above is more suitable for enabling random access use cases, and we endorse that approach.

TKCOBA commented 5 years ago

COBA broadly supports Data61’s recommended approach. With respect to the recommended Additional Rules, we would appreciate clarification around how the maximum page size of 500 records was determined. Taking customer transaction history as an example, it is not clear whether the recommended maximum would be sufficient.

Pelsurry commented 5 years ago

We agree with @da-banking and @TKCOBA that the 500 limit seems arbitrary. It's not clear why there needs to be a fixed limit given the trade-off described by @speedyps.

500 might be sufficient for use-cases involving an individual consumer, but it is easy to envisage use-cases for small to medium businesses that require more than 500 records. Keep in mind that the UK system only covers bank accounts held by individuals and small businesses. It is proposed that the Australian regime be open to all account holders (ie including medium to larger businesses).

A high limit could result in API consumers waiting for longer periods. This could be unacceptable for use-cases returning only small datasets (eg a consumer requesting their saved payee list). But for use cases that might involve larger record sets (eg feeding business transactions into an accounting system), this delay might be acceptable.

It might be more appropriate to give API providers flexibility and control acceptable response times via standard service levels in the rules published by the regulator (assuming that is the purpose of setting a hard limit).

NationalAustraliaBank commented 5 years ago

This pagination proposal is aligned with what we would expect from a NAB point of view, subject to a few minor modifications and clarifications:

  1. Query parameters for paging could be more human readable:

    • We prefer 'page' to 'pg'
    • We prefer 'pageSize' to 'pgsize'
  2. Can more clarity be provided for the following statements:

"Each end point that can return multiple records will stipulate whether pagination is supported for the end point or not"

Question: Is pagination mandatory as per each specific end point and will be called out in the specification? OR is it up to the discretion of each provider to make this determination, and for them to stipulate if their implementation's end point supports pagination or not?

"In addition to the data requested a provider MUST also provide the following additional information in the response payload"

Question: In the case that only one page is available will all the page meta data still be required?

JamesMBligh commented 5 years ago

My response to feedback is as follows...

Paging vs Filtering: It was expected that end points would provide filters as well as pagination. The existence of pagination support does not preclude the use of filtering (or vice versa).
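To illustrate how the two can coexist, here is a minimal sketch in which a date-range filter is applied first and the filtered result is then paged. Everything in it is an assumption for illustration: the function name `filter_then_page`, the `date` field on each record, the newest-first ordering and the parameter names are not taken from the proposal.

```python
from datetime import date


def filter_then_page(transactions, oldest, newest, page=1, page_size=25):
    """Apply a date-range filter, then return one page of the filtered result."""
    # Filtering narrows the data set; it does not replace paging.
    matching = [t for t in transactions if oldest <= t["date"] <= newest]
    # Assumed ordering: newest first, so page 1 holds the most recent records.
    matching.sort(key=lambda t: t["date"], reverse=True)
    start = (page - 1) * page_size
    return matching[start:start + page_size]


# Hypothetical data: one transaction per day for September 2018.
september = [{"id": d, "date": date(2018, 9, d)} for d in range(1, 31)]
first_page = filter_then_page(
    september, date(2018, 9, 10), date(2018, 9, 20), page=1, page_size=5
)
print([t["id"] for t in first_page])
```

A filtered request can still match a large number of records, which is why a page size limit remains useful even when filters are available.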

Parameter Names: Sure, happy to change them to be more readable.

Mandatory Or Not: The proposal indicates that the use of pagination for a specific end point will be nominated in the documentation for that end point so a case by case decision can be made.

Page Size Limit: The 500 limit was entirely arbitrary. The UK used 1000 which I thought excessive but I'm happy to align with that standard. The reasons for the limit are as follows:

Meta Data: The meta data fields would be required as per the logic documented. They are needed even for the last page, since these fields allow the data recipient to know that it is the last page (bearing in mind data may change between invocations).
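A minimal sketch of what such a response might look like, assuming the readable parameter names suggested earlier in this thread (`page`, `pageSize`), the `totalRecords` and `totalPages` meta fields, and `prev`, `next` and `last` links. The `self` link, the `data` key, the function name `paginate`, the example URL and the choice to clamp oversized page sizes rather than reject them are all assumptions made for illustration, not part of the proposal.

```python
import math

DEFAULT_PAGE_SIZE = 25  # default discussed in this thread
MAX_PAGE_SIZE = 500     # proposed maximum discussed in this thread


def paginate(records, base_url, page=1, page_size=DEFAULT_PAGE_SIZE):
    """Return one page of records with hypothetical links and meta objects."""
    # Assumption: an oversized request is clamped to the maximum, not rejected.
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))
    total_records = len(records)
    total_pages = math.ceil(total_records / page_size)
    if page < 1 or page > max(total_pages, 1):
        raise ValueError("page out of range")

    def url(p):
        return f"{base_url}?page={p}&pageSize={page_size}"

    links = {"self": url(page)}
    if page > 1:
        links["prev"] = url(page - 1)   # omitted on the first page
    if page < total_pages:
        links["next"] = url(page + 1)   # omitted on the last page
    if total_pages > 0:
        links["last"] = url(total_pages)

    start = (page - 1) * page_size
    return {
        "data": records[start:start + page_size],
        "links": links,
        "meta": {"totalRecords": total_records, "totalPages": total_pages},
    }


response = paginate(list(range(60)), "https://example.test/transactions", page=3)
print(response["meta"], sorted(response["links"]))
```

In this sketch a recipient can detect the last page either from the absence of a `next` link or by comparing the requested page with `totalPages`, and the meta fields are present even when only one page exists.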

Prev Page: D'oh! Of course this should be present. If I missed it then it was an oversight.

Concerns Over Performance: I am concerned about the perception that paging of data would create a performance problem since it is a truncation of a data set that would otherwise be provided in full. I would like to understand this concern more - perhaps at the in person meeting on the 4th.

-JB-

JamesMBligh commented 5 years ago

I have now closed consultation on this decision. A recommendation incorporating feedback will be made to the Chair in due course.

-JB-

JamesMBligh commented 5 years ago

The finalised decision for this topic has been endorsed. Please refer to the attached document: Decision 022 - Paging.pdf

-JB-