ConsumerDataStandardsAustralia / standards-maintenance

This repository houses the interactions, consultations and work management to support the maintenance of baselined components of the Consumer Data Right API Standards and Information Security profile.

Add an unauthenticated GetDataHolderBrands endpoint exposed as a public API #444

Closed CDR-API-Stream closed 2 years ago

CDR-API-Stream commented 2 years ago

Description

As part of the API Uplift for Data Holder Multi-Sector support #424, there are considerations to be made on how the GetDataHolderBrands endpoint can be augmented to allow unauthenticated client access.

This piece of work has been extracted into its own change request to allow this topic to be considered beyond maintenance iteration 9.

Area Affected

Register APIs will need to be structured to expose GetDataHolderBrands information from a public API

Change Proposed

The following changes are proposed:

  1. Create a new GetDataHolderBrands endpoint returning the same payload in time for energy
  2. Move this endpoint out of the secure domain, removing MTLS and authentication requirements
  3. Move this endpoint into the public API domain, allowing the API to be accessible via CDN and leverage ETags
  4. Add industry filtering behaviour:
     • Default: all data holder brands will be returned
     • Query parameter filter: allow filtering by industry sector
  5. Remove paging, aligning the response definition to other public APIs
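
As a rough illustration of how a client might leverage ETags on a public, CDN-fronted endpoint, the sketch below uses standard HTTP revalidation (`If-None-Match` and a `304 Not Modified` response). It is not taken from the standards: the function names are hypothetical, and it assumes the endpoint returns an `ETag` response header, which at this point is only proposed.

```python
from typing import Dict, Optional, Tuple

# A cached entry is (etag, body). All names here are hypothetical.
Cache = Optional[Tuple[str, bytes]]


def conditional_headers(cached_etag: Optional[str]) -> Dict[str, str]:
    """Request headers for revalidation; empty when nothing is cached yet."""
    return {"If-None-Match": cached_etag} if cached_etag else {}


def update_cache(cache: Cache, status: int, headers: Dict[str, str], body: bytes) -> Cache:
    """Return the cache entry to keep after a response.

    Assumes `headers` is a plain dict keyed exactly as "ETag"; real HTTP
    clients usually expose case-insensitive header lookups instead.
    """
    if status == 304:
        # Not modified: the previously cached body is still current.
        return cache
    if status == 200:
        etag = headers.get("ETag")
        # Without an ETag there is nothing to revalidate against later.
        return (etag, body) if etag else None
    raise RuntimeError(f"unexpected status {status}")
```

A client would send `conditional_headers(...)` with each poll and only re-parse the payload when the cache entry changes.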

Options:

  1. Create an unauthenticated Data Holder Brands API returning the same content as the current authenticated API. The authenticated version of the API will be deprecated once a schedule has been determined

  2. Create an unauthenticated Data Holder Brands API returning a subset of the content currently returned in the authenticated API. Security endpoint information would NOT be included in this API. Both the authenticated and unauthenticated APIs would be maintained.

Considerations:

perlboy commented 2 years ago

If Option 1 is being evaluated there should also be a consideration for payload integrity verification.

CDR-API-Stream commented 2 years ago

During Maintenance Iteration 10 collaboration, it was identified that one of the key use-cases for having a public GetDataHolderBrands API is to allow public clients to discover data holders in the CDR and their respective product reference data, status, and outage endpoints.

Option 2 could conceivably be constrained in scope to target this use case.

ACCC-CDR commented 2 years ago

A significant use case for an unauthenticated endpoint to share Data Holder information is discovery of Product Reference Data (PRD) endpoints. DSB/Treasury have expressed a desire for each Energy data holder to implement their own endpoint for serving up PRD, just like the banking sector.

The publicBaseUri of the Data Holder Brand is used to formulate the PRD endpoint, e.g. <publicBaseUri> + /cds-au/v1/banking/products. The only way to retrieve the publicBaseUri of the Data Holder Brand via the Register is to use the Get Data Holder Brands API. This is currently a secure endpoint, requiring mTLS and authentication, and can only be called by active and accredited data recipients. However, the PRD endpoints offered by data holder brands are unauthenticated, allowing anyone to retrieve product reference data regardless of their status in CDR.
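
For illustration only, the concatenation described above could look like the following sketch. The helper name and the example host are hypothetical; only the `/cds-au/v1/banking/products` path comes from the text above.

```python
# Path taken from the example in the comment above; banking sector only.
BANKING_PRD_PATH = "/cds-au/v1/banking/products"


def banking_prd_url(public_base_uri: str) -> str:
    """Join a brand's publicBaseUri with the banking PRD products path.

    Strips a trailing slash from the base so the result never contains "//"
    between the base and the path.
    """
    return public_base_uri.rstrip("/") + BANKING_PRD_PATH


# Hypothetical base URI, for demonstration only.
print(banking_prd_url("https://data.example.com/"))
```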

The Get Data Holder Brands API returns information about data holder brands that have been made active in the CDR ecosystem. These brands have gone through the CDR on-boarding process, including completing CTS. The information about the data holder brand (legal entity, brand, endpoints) is collected through the on-boarding process.

The obligation to share PRD prior to consumer data sharing was in place for the banking sector and is also in place for the energy sector, and appears to be a pattern that will be used for each industry moving forward. Therefore, in order for the Register to be used to discover PRD endpoints, only a subset of information needs to be collected from data holders. The CDR on-boarding process is currently catering for consumer data sharing obligations, not PRD obligations.

The question then arises: how can PRD endpoints be made available via the Register (both prior to and after the consumer data sharing obligations of a data holder brand) to allow discovery and use by interested participants?

The ACCC proposes adding another API endpoint to the Register: Get Product Reference Data Endpoints. This endpoint will be unauthenticated. The URL for this endpoint will have a similar format to other Register API endpoints. Parameters and field names will be broadly comparable to those for Get Data Holder Brands.

This endpoint would not replace the Get Data Holder Brands API. The Get Data Holder Brands API would continue to be used by active data recipients to discover data holder brands that are available for consumer data sharing, whilst the PRD endpoints API can be used for product discovery and comparator purposes.

The proposed endpoint will return the following data:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| brandName | String | true | Name of the brand that PRD is being served up for |
| brandId | String | false | Unique ID of the brand that PRD is being served up for |
| publicBaseUri | UriString | true | PRD base URL |
| logoUri | UriString | true | Icon URL |
| industry | String | true | The industry the Data Holder Brand belongs to. |
| lastUpdated | DateTime | true | Time when the record was last updated |
| abn | String | false | Australian Business Number for the organisation |
| acn | String | false | Australian Company Number for the organisation |
| arbn | String | false | Australian Registered Body Number. ARBNs are issued to registrable Australian bodies and foreign companies |

Options

  1. Implement the endpoint as described above. This is ACCC’s preferred option.

  2. Use a flat text file, similar to the datasources.json file used by the banking sector. This option is not preferred as it has no clear migration path and will require continuing manual maintenance.

  3. Remove authentication from GetDataHolderBrands. This option is not preferred as the semantics of GetDataHolderBrands would have to change to support PRD discovery. This entails unnecessary risk and unnecessary impact on participants.

Considerations

af-stayorgo commented 2 years ago

Thanks for taking the time to assess the issue and develop a proposed solution. Stay or Go is supportive of Option 1 proposed above - i.e. to create a new API, Get Product Reference Data Endpoints.

With regard to the brandId, this should be the same as the dataHolderBrandId in the existing Get Data Holder Brands API.

This has obviously been a significant gap and oversight for 2-3 years since the PRD endpoints were first implemented in the banking industry, so we request this change be ranked as an absolute priority for implementation. Waiting until the end of the year, when Energy is to be implemented, is not really acceptable in our mind.

ACCC-CDR commented 2 years ago

Given conversations during the maintenance iteration, the ACCC would like to make it clear that if it is not acceptable for brandId to be optional, this item will need to be delayed until a time that the ACCC can assess a solution where this would be achievable.

af-stayorgo commented 2 years ago

With regard to brandId and a delay, I think 'delay' and the trade-off need to be defined. If the API can be delivered within the next month without the brandId, then that would be worth doing. If it's going to be delivered in 6 months, then brandId should be in-scope.

If faced with the above choice, I'd recommend delivering V1 of the API in one month's time without the brandId and then adding it to the API in 6 months' time as V2.

Not having brandId does create significant complexity and operational risk for anyone consuming the data, so it's not an acceptable long-term solution.

Not having the API at all makes the PRD largely unusable, and certainly unreliable.

CDR-API-Stream commented 2 years ago

Collaboration on this issue through maintenance iteration 10 has resulted in the following DSB proposal, expanding on @ACCC-CDR's initial proposal.

Rather than duplicate the GetDataHolderBrands API for unauthenticated clients, a new API optimised for discovery of public data holder APIs (PRD, status, outage and any future public endpoints) will be defined, named Get Data Holder Brand Public Endpoints.

The schema will be optimised for presentation of Data Holder Brands and discovery of associated public endpoints. It will therefore have the following fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| brandName | String | true | Name of the data holder brand hosting the public endpoints |
| brandId | String | ~false~ true | ID of the data holder brand hosting the public endpoints |
| publicBaseUri | UriString | true | Public endpoints base URL |
| logoUri | UriString | true | Data Holder Brand logo URL |
| ~industry~ industries | ~String~ [string] | true | The industries the Data Holder Brand belongs to. Please note that the CDR Register entity model is constrained to one industry per brand which is planned to be relaxed in the future. |
| lastUpdated | DateTime | true | Time when the record was last updated |
| abn | String | false | Australian Business Number for the organisation |
| acn | String | false | Australian Company Number for the organisation |
| arbn | String | false | Australian Registered Body Number. ARBNs are issued to registrable Australian bodies and foreign companies |

The payload will be a shortened version of the data holder brands payload with a flat simplified structure.

The industries array will follow the same convention as GetDataHolderBrands https://consumerdatastandardsaustralia.github.io/standards/#tocSregisterdataholderbrand. Only one industry will initially be supported until multiple industry sectors per brand are supported on the CDR Register.

Discussion has occurred on whether the brandId can remain optional until the CDR Register is able to guarantee the field will be populated. The DSB does not seek to define an interim solution in the standards. This leaves the responsibility to the ACCC to coordinate the release of this API based on their release schedules. Any interim solution will also remain the responsibility of the ACCC.

CDR-API-Stream commented 2 years ago

Feedback we have received from the ACCC indicates they will not be able to specify a delivery date for the proposed solution and an interim solution will be required. As highlighted from ACCC's feedback, a mandatory brandId will not be available in the short to medium term.

Feedback from industry during MI 10 has highlighted that an interim solution that provides a mandatory unique identifier in addition to the brandId would meet the need to reconcile each record until a complete solution can be delivered.

In order to give the ACCC flexibility in delivering the full solution, provide partial value to API clients, and to simplify the reading of the standards, the DSB has decided to propose an interim solution.

Proposal

Based on the feedback received from stakeholders, the DSB proposes the following:

  1. brandId moves from mandatory to optional
  2. To enable clients to reconcile data holder brand records from this API with those retrieved in previous requests, an interim identifier (interimId) is proposed. This will allow clients to use the following logic:
     a. IF the record includes a brandId, use brandId as the record identifier, ELSE use interimId as the record identifier
     b. If in a future request the brandId is populated, move the record identifier from interimId to brandId
  3. For each record, when brandId is finally populated, the interimId will be maintained for a minimum of 1 month prior to being retired.

This logic will provide a transition period, allowing clients to migrate records from referencing the interimId to the brandId in anticipation of brandId becoming the mandatory identifier. Once brandId becomes the de facto standard identifier, interimId can be phased out per record after the transition period of a minimum of 1 month has passed.
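
The client-side logic in steps a and b of the proposal could be sketched roughly as follows. This is an illustrative reading, not part of the standards: the function names and the in-memory `store` dictionary are hypothetical, and the sketch assumes brandId and interimId values never collide with each other.

```python
from typing import Dict


def record_key(record: dict) -> str:
    """Step a: prefer brandId, otherwise fall back to interimId."""
    brand_id = record.get("brandId")
    if brand_id:
        return brand_id
    interim_id = record.get("interimId")
    if interim_id:
        return interim_id
    raise ValueError("record has neither brandId nor interimId")


def reconcile(store: Dict[str, dict], record: dict) -> str:
    """Upsert a record into a local store keyed by its identifier.

    Step b: if the record now carries a brandId and was previously stored
    under its interimId, drop the old key so the record lives under brandId.
    """
    key = record_key(record)
    interim_id = record.get("interimId")
    if record.get("brandId") and interim_id and interim_id in store:
        del store[interim_id]
    store[key] = record
    return key


# Hypothetical records: first seen with only an interimId, later with both.
store: Dict[str, dict] = {}
reconcile(store, {"interimId": "tmp-1", "brandName": "Example Bank"})
reconcile(store, {"interimId": "tmp-1", "brandId": "b-1", "brandName": "Example Bank"})
print(sorted(store))  # ['b-1']
```

The migration in step b only works while both identifiers are returned together, which is what the minimum one month overlap is intended to provide.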


Interim Solution (V1) - targeted latest dependency date of 1 October 2022

Note that this is intended to be a temporary solution until the ACCC is able to deliver the complete solution.

| Name | Type | Required | X-Conditional | Description |
| --- | --- | --- | --- | --- |
| brandName | String | true | | Name of the data holder brand hosting the public endpoints |
| brandId | String | false | true | ID of the data holder brand hosting the public endpoints. To be used to uniquely identify the record and not to be reused |
| interimId | String | false | true | Interim ID of the data holder brand hosting the public endpoints. This is to be used to uniquely identify the record when brandId is not populated and is not to be reused |
| publicBaseUri | UriString | true | | Public endpoints base URL |
| logoUri | UriString | true | | Data Holder Brand logo URL |
| industries | [string] | true | | The industries the Data Holder Brand belongs to. Please note that the CDR Register entity model is constrained to one industry per brand which is planned to be relaxed in the future. |
| lastUpdated | DateTime | true | | Time when the record was last updated |
| abn | String | false | | Australian Business Number for the organisation |
| acn | String | false | | Australian Company Number for the organisation |
| arbn | String | false | | Australian Registered Body Number. ARBNs are issued to registrable Australian bodies and foreign companies |
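
A client could model a single V1 record along these lines. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the response envelope around the records is not covered, and the X-Conditional columns are read here as "at least one of brandId or interimId must be present", which is an interpretation of the table rather than a quoted rule.

```python
from dataclasses import dataclass
from typing import List, Optional

# Fields marked Required = true in the V1 table.
REQUIRED_FIELDS = ("brandName", "publicBaseUri", "logoUri", "industries", "lastUpdated")


@dataclass(frozen=True)
class BrandSummaryV1:
    brand_name: str
    public_base_uri: str
    logo_uri: str
    industries: List[str]
    last_updated: str  # kept as the raw DateTime string
    brand_id: Optional[str] = None
    interim_id: Optional[str] = None
    abn: Optional[str] = None
    acn: Optional[str] = None
    arbn: Optional[str] = None


def parse_brand_summary_v1(raw: dict) -> BrandSummaryV1:
    """Validate and convert one record from a decoded JSON object."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Assumed reading of X-Conditional: one of the two identifiers must exist.
    if not raw.get("brandId") and not raw.get("interimId"):
        raise ValueError("expected brandId or interimId to be populated")
    return BrandSummaryV1(
        brand_name=raw["brandName"],
        public_base_uri=raw["publicBaseUri"],
        logo_uri=raw["logoUri"],
        industries=list(raw["industries"]),
        last_updated=raw["lastUpdated"],
        brand_id=raw.get("brandId"),
        interim_id=raw.get("interimId"),
        abn=raw.get("abn"),
        acn=raw.get("acn"),
        arbn=raw.get("arbn"),
    )
```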


Complete Solution (V2) - targeted release to be assessed in a future maintenance iteration

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| brandName | String | true | Name of the data holder brand hosting the public endpoints |
| brandId | String | true | ID of the data holder brand hosting the public endpoints. To be used to uniquely identify the record and not to be reused |
| publicBaseUri | UriString | true | Public endpoints base URL |
| logoUri | UriString | true | Data Holder Brand logo URL |
| industries | [string] | true | The industries the Data Holder Brand belongs to. Please note that the CDR Register entity model is constrained to one industry per brand which is planned to be relaxed in the future. |
| lastUpdated | DateTime | true | Time when the record was last updated |
| abn | String | false | Australian Business Number for the organisation |
| acn | String | false | Australian Company Number for the organisation |
| arbn | String | false | Australian Registered Body Number. ARBNs are issued to registrable Australian bodies and foreign companies |

af-stayorgo commented 2 years ago

This approach looks fine. Key to success is to deliver the interim solution quickly. Thanks

ACCC-CDR commented 2 years ago

The ACCC reiterates its previous commentary that if the solution the ACCC has previously proposed is not accepted then we’ll be unable to commit to a 1st of October 2022 delivery date. The ACCC does not support the inclusion of an interimId for the same reason it does not support a mandatory brandId, the additional solution delivery and operational effort. If the ACCC’s proposed solution cannot be agreed upon, the ACCC suggests deferring this GitHub issue to the next Maintenance Iteration so that the alternative options that the ACCC proposed on the 23rd March can be discussed.

perlboy commented 2 years ago

> The ACCC reiterates its previous commentary that if the solution the ACCC has previously proposed is not accepted then we’ll be unable to commit to a 1st of October 2022 delivery date.

This sounds a lot like hostage taking by the regulator, perhaps if the ACCC engaged the community earlier it wouldn't be facing timeline issues. October is 6 months away, that's the time accepted by all the other participants in the ecosystem, I find it pretty hypocritical for the ACCC to enforce FDO obligations, request shorter timelines on Participants and then not accept the convention for its own delivery.

> The ACCC does not support the inclusion of an interimId for the same reason it does not support a mandatory brandId, the additional solution delivery and operational effort.

Is the ACCC honestly suggesting that these records don't already have a database identifier associated with them? If there isn't an identifier already present how else would the ACCC be administering the records? Hand coding them? Smoke signals from some tower in Canberra? I thought the interimId proposal was to allow for the id to change which seems to eliminate any architectural reason to oppose it.

> If the ACCC’s proposed solution cannot be agreed upon, the ACCC suggests deferring this GitHub issue to the next Maintenance Iteration so that the alternative options that the ACCC proposed on the 23rd March can be discussed.

I was at the last maintenance call and I don't believe ACCC bothered to show up, perhaps it should have. If the ACCC is going to throw its toys out of the pram whenever it doesn't get what it wants then perhaps the Minister should be called in to resolve who exactly is responsible for Register related Standards since it seems increasingly ambiguous.

It isn't a reasonable nor sustainable solution for non-government technical implementers trying to plan activities based on guidance from multiple government bodies on delivery timelines that are, in the Register case, quite opaque.

af-stayorgo commented 2 years ago

Just want to highlight again that this API is already close to 3 years late - i.e. it should have been delivered on 1-Jul-2019 when the first PRD APIs were launched. Waiting another 6 months for this API really isn't an acceptable outcome for the industry or the success of the CDR.

This is a very simple API and shouldn't take more than a week to build. We therefore request that it be prioritised and delivered within the next 4 weeks. Happy to have a conversation with the ACCC (assuming they are responsible for building the API) if technical advice is required.

Regarding the interimId, the suggestion above to use the ACCC's existing database identifier makes sense, assuming they have one. If the existing database has been implemented without a unique identifier (e.g. in an Excel spreadsheet), then the ABN should be a sufficient proxy - i.e. a company may rebrand, but they don't normally change their ABN. Otherwise, it would be very simple to add an 'interimId' to the existing database.

CDR-API-Stream commented 2 years ago

To summarise, the point of contention remained: the interimId presents increased complexity for ACCC delivery, while industry has highlighted the urgency of delivering this API with a mandatory identifier in order for it to be fit for purpose.

With this being considered, the DSB’s position remained unchanged as the interim solution shares the implementation burden between the CDR Register and clients until the complete solution is available. No other solution has been proposed which is seen as fit for purpose. The 1 October 2022 delivery date proposed is the latest that this API can be released and provide meaningful value to the CDR. Later delivery will erode the value of this API as industry may look for alternative sources for discovery. Earlier delivery will increase the value of this API as it will increase the period unauthenticated clients are able to leverage its value and increase confidence that the CDR Register is the source of truth in the CDR.

API Details

API Name: Get Data Holder Brands Summary
Path: GET /cdr-register/v1/{industry}/data-holders/brands/summary
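
As a small illustration of how the `{industry}` path parameter is filled in, the sketch below builds a request URL from the path above. The helper name and the Register host are hypothetical placeholders; only the path template comes from the API details.

```python
from urllib.parse import quote

# Path template as given in the API details above.
SUMMARY_PATH_TEMPLATE = "/cdr-register/v1/{industry}/data-holders/brands/summary"


def brands_summary_url(register_base: str, industry: str) -> str:
    """Build the Get Data Holder Brands Summary URL for one industry.

    The industry value is percent-encoded so it is safe as a path segment.
    """
    path = SUMMARY_PATH_TEMPLATE.format(industry=quote(industry, safe=""))
    return register_base.rstrip("/") + path


# "https://register.example" is a placeholder host, not the real Register.
print(brands_summary_url("https://register.example", "banking"))
```

Because the endpoint is unauthenticated, a plain HTTP GET on the resulting URL would be enough, with no mTLS or access token.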

Release v1.17.0 of the CDS incorporates the interim solution (V1) into the standards.

Issue #518 has been raised to track delivery of the Complete Solution (V2)

Please refer to Decision 237 for further details.