co-cddo / ukgov-metadata-exchange-model

A metadata model for describing data assets for exchanging between UK government organisations.
https://co-cddo.github.io/ukgov-metadata-exchange-model/

Potential changes to metadata standards to support APIs #56

Closed asmith-nhsx closed 10 months ago

asmith-nhsx commented 11 months ago

As part of the work to migrate the APIs from api.gov.uk into the Data Marketplace, we need to consider what additional metadata might be useful to include specifically for APIs. Opening this issue to collect thoughts and proposals for consideration.

Some considerations:

AlasdairGray commented 11 months ago

APIs are already supported by the metadata exchange model. They are captured as the DataService type as per the DCAT standard.

With regard to your specific considerations:

  • API types are captured with the serviceType property which has three permissible values EVENT/REST/SOAP (see ServiceTypeValues for details)

  • API access controls: it would be helpful to have more detail on what you mean here and the use case that it supports.

  • API json schemas: this is already captured through the endpointDescription property which can be used to point to an OpenAPI or SOAP specification

If you have more specifics about any of these, please could you open individual issues for each property as it helps to keep the discussions focused on the specific property.

asmith-nhsx commented 11 months ago

> APIs are already supported by the metadata exchange model. They are captured as the DataService type as per the DCAT standard.

I think this is the issue: DCAT is focussed on data services, which may not cover all API use cases, because some APIs are purely transactional.

> With regard to your specific considerations:
>
> • API types are captured with the serviceType property which has three permissible values EVENT/REST/SOAP (see ServiceTypeValues for details)

This list could do with extending; for example, what about RPC? I would also see Event as a type of API rather than a protocol. The overall API type might be Data, Event or Transactional, and the protocol might then be REST, SOAP, HTTP, Kafka, etc.
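
The split suggested above can be sketched as two independent enumerations. This is only an illustration: the class names, field names and the exact value lists are assumptions, and none of them are defined by the exchange model.

```python
from dataclasses import dataclass
from enum import Enum


class ApiType(Enum):
    """What the API is for (hypothetical; not part of the exchange model)."""
    DATA = "Data"
    EVENT = "Event"
    TRANSACTIONAL = "Transactional"


class Protocol(Enum):
    """How the API is reached (hypothetical list, taken from the comment above)."""
    REST = "REST"
    SOAP = "SOAP"
    RPC = "RPC"
    HTTP = "HTTP"
    KAFKA = "Kafka"


@dataclass(frozen=True)
class ApiClassification:
    """Keeps purpose and protocol as separate, independently chosen values."""
    api_type: ApiType
    protocol: Protocol


# Illustrative records only; they do not describe any real service.
submission_api = ApiClassification(ApiType.TRANSACTIONAL, Protocol.REST)
event_stream = ApiClassification(ApiType.EVENT, Protocol.KAFKA)
```

Keeping the two values separate means an event API delivered over REST and one delivered over Kafka can share an API type without sharing a protocol.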

> • API access controls: it would be helpful to have more detail on what you mean here and the use case that it supports.

For OAS APIs and AsyncAPI events this could be taken from the security scheme type, e.g. apiKey, http, mutualTLS, oauth2, openIdConnect. In addition, we may want to capture the flavour of access control (attribute-based, role-based, claims-based, etc.).
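
As a rough sketch of how that value could be derived, the function below reads the security scheme types out of an OpenAPI document that has already been parsed into a dictionary. The function name and the example document are hypothetical; only the `components.securitySchemes` location and the five type names come from the OpenAPI 3.1 specification. AsyncAPI documents and `$ref` entries are not handled here.

```python
# Security scheme types listed in the comment above (OpenAPI 3.1).
OAS_SECURITY_TYPES = {"apiKey", "http", "mutualTLS", "oauth2", "openIdConnect"}


def security_scheme_types(spec: dict) -> list[str]:
    """Return the sorted, de-duplicated security scheme types declared in a spec."""
    schemes = spec.get("components", {}).get("securitySchemes", {})
    found = {s.get("type") for s in schemes.values() if isinstance(s, dict)}
    # Ignore missing or unrecognised values rather than guessing at them.
    return sorted(found & OAS_SECURITY_TYPES)


# Hypothetical, heavily trimmed OpenAPI document.
example_spec = {
    "openapi": "3.1.0",
    "components": {
        "securitySchemes": {
            "key": {"type": "apiKey", "name": "X-API-Key", "in": "header"},
            "login": {"type": "oauth2", "flows": {}},
        }
    },
}

print(security_scheme_types(example_spec))
```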

> • API json schemas: this is already captured through the endpointDescription property which can be used to point to an OpenAPI or SOAP specification

Will a URI work in all cases if the API's documentation is not public?

> If you have more specifics about any of these, please could you open individual issues for each property as it helps to keep the discussions focused on the specific property.

I would prefer to initially keep this issue open as a discussion rather than track all issues separately.

asmith-nhsx commented 10 months ago

In the DCAT vocabulary there is a conformsTo property which may be more appropriate for defining if a service is SOAP, REST etc.

serviceType doesn't seem to have a DCAT equivalent; it seems to reference the genre/type, which should probably just be Service for an API.

In terms of the API specification this would be referenced in the endpointDescription.

By implication APIs in the metadata are "data services", so there is no natural place to say whether an API is Data, Event or Transactional.

Could we define a supertype of DataService that captures the additional attributes? This could map to the service class.

I think the fundamental issue remains that not all our APIs are data services.

AlasdairGray commented 10 months ago

Can you give some examples of the services that you have in mind? It'll be easier to understand with some concrete examples.

asmith-nhsx commented 10 months ago

A Data API would likely be one primarily concerned with returning persistent data resources to a consumer, for example: Cefas Data Portal APIs - All metadata records and published datasets which are published on the Cefas Data Portal are available via this API.

A transactional API would be primarily about recording a transaction with, or submitting some data to, a service, for example: Safety and Security Import Declarations API - This API allows users to create a new ENS submission and amend an existing ENS submission.

An eventing API would be an asynchronous service for publishing or subscribing to a type of event, for example, NHS Digital Vaccination Events - FHIR, as it uses a publish subscribe mechanism to publish vaccination event information.

I think the thrust of the metadata standard is toward the first type as a means of delivering a dataset to a consumer. For the purposes of the data marketplace (DM) we need to consider all types of API. Therefore it makes sense for the DM to define additional attributes for APIs outwith the metadata standard. Given this would be DM specific it makes sense not to extend the metadata standard to accommodate this wider definition of APIs.

Closing this discussion therefore.

AlasdairGray commented 10 months ago

> A Data API would likely be one primarily concerned with returning persistent data resources to a consumer, for example: Cefas Data Portal APIs - All metadata records and published datasets which are published on the Cefas Data Portal are available via this API.

This is a portal of data assets. The contents of the portal would be described using the metadata exchange model. The portal itself could also be seen as a data service and its API described. It is likely that such an API would be REST-based, as per the Data Marketplace prototype, and therefore modelled as a DataService with a serviceType value of REST.
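
A minimal sketch of such a record, written as a plain dictionary, might look like the following. Only DataService, serviceType, endpointDescription and the three permissible values come from this discussion; the other keys, the URLs and the helper function are assumptions for illustration and do not reflect the model's actual serialisation.

```python
# Permissible serviceType values as stated earlier in this thread.
SERVICE_TYPE_VALUES = {"EVENT", "REST", "SOAP"}

# Hypothetical record; example.org URLs are placeholders.
portal_api = {
    "type": "DataService",
    "title": "Example data portal API",
    "serviceType": "REST",
    "endpointURL": "https://example.org/api",
    # Points at a machine-readable description such as an OpenAPI document.
    "endpointDescription": "https://example.org/api/openapi.json",
}


def check_service_type(record: dict) -> str:
    """Return the record's serviceType, or raise if it is not a permitted value."""
    value = record.get("serviceType")
    if value not in SERVICE_TYPE_VALUES:
        raise ValueError(f"serviceType must be one of {sorted(SERVICE_TYPE_VALUES)}, got {value!r}")
    return value


print(check_service_type(portal_api))
```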

> A transactional API would be primarily about recording a transaction with, or submitting some data to, a service, for example: Safety and Security Import Declarations API - This API allows users to create a new ENS submission and amend an existing ENS submission.

The example here is about the operation that you perform over the API. What we were trying to capture with the serviceType property was the protocol used. If I look into the links on that page, then they are mostly REST services.

> An eventing API would be an asynchronous service for publishing or subscribing to a type of event, for example, NHS Digital Vaccination Events - FHIR, as it uses a publish subscribe mechanism to publish vaccination event information.

This would be captured by the EVENT value given to the serviceType field.

It would be possible to capture a finer-grained taxonomy of service types; for example, you could capture that FHIR is a type of EVENT service, or similarly for the OGC geospatial services, which are now mostly REST offerings.
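
One way to picture such a taxonomy is a lookup from each narrower value to its broader one, so that a record tagged with a specific type can still be reduced to EVENT, REST or SOAP. The sketch below is hypothetical: the mapping, the "OGC" label and the function name are assumptions, with the FHIR and OGC placements taken from the examples in this comment.

```python
# Top-level values currently permitted for serviceType.
TOP_LEVEL = {"EVENT", "REST", "SOAP"}

# Hypothetical narrower-to-broader links, following the examples above.
BROADER = {
    "FHIR": "EVENT",
    "OGC": "REST",
}


def top_level_service_type(value: str) -> str:
    """Walk up the taxonomy until one of the top-level values is reached."""
    seen = set()
    while value not in TOP_LEVEL:
        if value in seen or value not in BROADER:
            raise ValueError(f"no top-level service type for {value!r}")
        seen.add(value)
        value = BROADER[value]
    return value


print(top_level_service_type("FHIR"))
```

A consumer that only understands the three existing values could call this function and ignore the finer distinctions.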

For the first instance of the exchange model we decided to keep things simple and see whether it satisfied the use cases of the Data Marketplace. It was a tradeoff of expressivity against ease

> I think the thrust of the metadata standard is toward the first type as a means of delivering a dataset to a consumer. For the purposes of the data marketplace (DM) we need to consider all types of API. Therefore it makes sense for the DM to define additional attributes for APIs outwith the metadata standard. Given this would be DM specific it makes sense not to extend the metadata standard to accommodate this wider definition of APIs.

I think that the metadata model is predominantly there to support the data marketplace, but it is a balancing act of expressivity vs the resource required by providers to supply that data vs the needs of consumers to be able to exploit the metadata.