openactive / dataset-api-discovery

OpenActive Dataset API Discovery Specification

`publisher` vs `creator` vs ? / certificates and feature lists #11

Open nickevansuk opened 4 years ago

nickevansuk commented 4 years ago

Problem

We need to align how we're representing organizations that are related to a dataset with Google's expectations for the purposes of SEO.

The current draft data model for Dataset API Discovery includes:

  "publisher": {
    "@type": "Organization",
    "name": "Fusion Lifestyle",
    "description": "Fusion Lifestyle was established in April 2000 ...",
    "email": "info@fusion-lifestyle.com",
    "legalName": "Fusion Lifestyle",
    "logo": {
      "@type": "ImageObject",
      "url": "https://res.cloudinary.com/gladstone/image/upload/fusion-lifestyle-live/ydokan4mlia7zigqd79d"
    },
    "url": "https://www.fusion-lifestyle.com/"
  },
  "bookingService": {
    "@type": "BookingService",
    "name": "Gladstone360",
    "url": "https://www.gladstonesoftware.co.uk",
    "softwareVersion": "3.0.2"
  },

bookingService is a property in the OpenActive namespace, and is not recognised by Google. It is also not clear whether the publisher property is being used correctly in this context.

Considerations

Google's Structured Data Documentation

Google's Structured Data Documentation recommends the property creator to represent "The creator or author of this dataset", and does not provide specific references for other properties (though it points to schema.org for more information).

schema.org

schema.org includes several options for attribution of the roles of organizations relating to a schema:Dataset:

Existing OpenActive dataset sites

The dataset site text reads:

This data is owned by <a href="{{publisher.url}}">{{publisher.legalName}}</a> and is licensed under the Creative Commons Attribution Licence (CC-BY v4.0) for anyone to access, use and share; using attribution "<a href="{{url}}"><span>{{publisher.name}}</span></a>".

Platform: <a href="{{bookingService.url}}">{{bookingService.name}} {{bookingService.softwareVersion}}</a>.

Note that single database systems generally set bookingService to match publisher, as they are the same.
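As a rough illustration of how the template above maps onto the JSON-LD, here is a minimal Python sketch that fills in the two lines from a dataset object. The function name render_attribution and the dataset "url" value are assumptions made for this example; the publisher and bookingService values are taken from the draft data model quoted earlier.

```python
from html import escape


def render_attribution(dataset: dict) -> str:
    """Render the ownership and platform lines from a dataset's JSON-LD."""
    publisher = dataset["publisher"]
    service = dataset["bookingService"]

    # Escape every value before placing it in HTML.
    publisher_url = escape(publisher["url"])
    legal_name = escape(publisher["legalName"])
    display_name = escape(publisher["name"])
    dataset_url = escape(dataset["url"])
    service_url = escape(service["url"])
    service_label = escape(f'{service["name"]} {service["softwareVersion"]}')

    ownership = (
        f'This data is owned by <a href="{publisher_url}">{legal_name}</a> '
        "and is licensed under the Creative Commons Attribution Licence "
        "(CC-BY v4.0) for anyone to access, use and share; using attribution "
        f'"<a href="{dataset_url}"><span>{display_name}</span></a>".'
    )
    platform = f'Platform: <a href="{service_url}">{service_label}</a>.'
    return ownership + "\n\n" + platform


example_dataset = {
    # Hypothetical dataset site URL, not taken from the thread.
    "url": "https://example.com/openactive/",
    "publisher": {
        "@type": "Organization",
        "name": "Fusion Lifestyle",
        "legalName": "Fusion Lifestyle",
        "url": "https://www.fusion-lifestyle.com/",
    },
    "bookingService": {
        "@type": "BookingService",
        "name": "Gladstone360",
        "url": "https://www.gladstonesoftware.co.uk",
        "softwareVersion": "3.0.2",
    },
}

print(render_attribution(example_dataset))
```

The ownership line reads only from publisher and the platform line only from bookingService, which is why the two properties can be discussed separately in the rest of this issue.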

Proposal

Note this proposal doesn't consider maintainer, which could be useful to include within the JSON-LD as a duplicate of creator (if set)?

Multiple database systems

For multiple database systems, where there is one dataset site per activity provider:

creator - The activity provider
publisher - The booking system

This data is owned by <a href="{{creator.url}}">{{creator.legalName}}</a> and is licensed under the Creative Commons Attribution Licence (CC-BY v4.0) for anyone to access, use and share; using attribution "<a href="{{url}}"><span>{{creator.name}}</span></a>".

Platform: <a href="{{publisher.url}}">{{publisher.name}} {{publisher.softwareVersion}}</a>.

Single database systems

For single database systems, where there is one dataset site that contains data from multiple activity providers:

publisher - The booking system

This data is owned by <a href="{{publisher.url}}">{{publisher.legalName}}</a> and is licensed under the Creative Commons Attribution Licence (CC-BY v4.0) for anyone to access, use and share; using attribution "<a href="{{url}}"><span>{{publisher.name}}</span></a>".

(Note in this proposal the "Platform" reference is removed from the HTML for Single database systems)
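The two cases in this proposal could be handled by one rendering rule: attribute ownership to creator when it is set, otherwise to publisher, and show the "Platform" line only when creator is set. The Python sketch below is one possible reading of the proposal, not an agreed implementation; the function name, the example dataset URLs and the "Example Booking Ltd" organisation are hypothetical.

```python
from html import escape


def render_proposed_attribution(dataset: dict) -> str:
    """Render attribution under the creator/publisher proposal."""
    creator = dataset.get("creator")
    publisher = dataset["publisher"]
    # Multiple database systems set creator; single database systems do not.
    owner = creator if creator is not None else publisher

    ownership = (
        f'This data is owned by <a href="{escape(owner["url"])}">'
        f'{escape(owner["legalName"])}</a> and is licensed under the '
        "Creative Commons Attribution Licence (CC-BY v4.0) for anyone to "
        "access, use and share; using attribution "
        f'"<a href="{escape(dataset["url"])}">'
        f'<span>{escape(owner["name"])}</span></a>".'
    )
    lines = [ownership]

    if creator is not None:
        # Here publisher is the booking system, so it gets the Platform line.
        parts = [publisher["name"], publisher.get("softwareVersion", "")]
        label = " ".join(part for part in parts if part)
        lines.append(
            f'Platform: <a href="{escape(publisher["url"])}">{escape(label)}</a>.'
        )
    return "\n\n".join(lines)


multi_db = {
    "url": "https://example.com/fusion/openactive/",  # hypothetical
    "creator": {
        "name": "Fusion Lifestyle",
        "legalName": "Fusion Lifestyle",
        "url": "https://www.fusion-lifestyle.com/",
    },
    "publisher": {
        "name": "Gladstone360",
        "url": "https://www.gladstonesoftware.co.uk",
        "softwareVersion": "3.0.2",
    },
}

single_db = {
    "url": "https://example.com/openactive/",  # hypothetical
    "publisher": {
        "name": "Example Booking",  # hypothetical organisation
        "legalName": "Example Booking Ltd",
        "url": "https://example.com/",
    },
}

print(render_proposed_attribution(multi_db))
print(render_proposed_attribution(single_db))
```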

Implementation note

We need to ensure the embedded DCAT markup reflects this change.

nickevansuk commented 4 years ago

Just highlighting an inconsistency with the above proposal:

One advantage of "bookingService" is that it is the name of the product, rather than the organisation. For example Gladstone Ltd have a number of products, including Gladstone360 and GladstoneOne, and their values for bookingService would be as follows:

  "bookingService": {
    "@type": "BookingService",
    "name": "Gladstone360",
    "softwareVersion": "2.0",
    "url": "https://www.gladstonesoftware.co.uk/platform-overview"
  },
  "bookingService": {
    "@type": "BookingService",
    "name": "GladstoneOne",
    "softwareVersion": "2.0",
    "url": "https://www.gladstonesoftware.co.uk/single-site"
  },

This also works well in terms of any conformance certificate - as it is necessarily the product that gets the conformance certificate, rather than the organisation.

Additionally the publisher is not the bookingService, or even the organisation that owns the bookingService, in the case of Gladstone above - the publisher is the actual leisure operator.

On reflection, perhaps it's best that we stick with the original proposal (which just uses publisher as whichever organisation is responsible for publishing, which might be the booking system owner or the activity provider, depending on the technology in use), and keep bookingService as the technology being used to publish (for which schema.org does not currently include a term).

Additionally, perhaps BookingService should subclass schema:Product, to make it clear that this relates to the product rather than the organisation.

thill-odi commented 3 years ago

Discussed on W3C call of 2020-09-09. Agree to retain current implementation.

nickevansuk commented 3 years ago

There's conflicting language being used here: BookingService is a product? Should we use a better name here?

thill-odi commented 3 years ago

Actually, aren't we really talking about a schema.org SoftwareApplication?

I would then suggest in relation to #15 that what we really want is featureList, which has the description "Features or modules provided by this application (and possibly required by other applications)", and is typed as Text or URL.

nickevansuk commented 3 years ago

Yes, great idea! Additionally WebApplication is a subclass of SoftwareApplication, so this should work even when the application is entirely web-based (e.g. "Playwaze").

I did look at featureList, however it doesn't reference a schema.org type, so doesn't allow for structured data within it...

thill-odi commented 3 years ago

Sorry, was reflecting on conformsTo and edited the above while you were posting. I think we really want featureList, as (now) noted above.

nickevansuk commented 3 years ago

Ha, no worries, I edited mine to match :)

thill-odi commented 3 years ago

Sorry, wasn't clear. I was referring to featureList in relation to #15. So, something like:

{
     "@type" : "SoftwareApplication",
     "name": "GladstoneOne",
     "softwareVersion": "2.0",
     "url": "https://www.gladstonesoftware.co.uk/single-site",
     "featureList": [
        "https://openactive.io/openactive-test-suite/example-output/controlled/certification/"
     ]
}

thill-odi commented 3 years ago

Formalised in 0.2 as in the example immediately above. For discussion.

nickevansuk commented 3 years ago

The discussion in the W3C Community Group call on 13 January 2021, taking inspiration from the WebAPI discussions, concluded that one evolution of the above could be to use CreativeWork (or a subclass of CreativeWork) within featureList.

Looking at this further, and given that it is difficult for us to add to the range of existing schema.org properties (we avoid doing so where possible), perhaps defining a new property within SoftwareApplication would be the best approach?

Using the pattern of softwareHelp, which has a range of CreativeWork, perhaps this could look like:

"bookingService": {
     "@type" : "SoftwareApplication",
     "name": "GladstoneOne",
     "softwareVersion": "2.0",
     "url": "https://www.gladstonesoftware.co.uk/single-site",
     "hasConformanceCertificate": [
       {
         "@type": "CreativeWork",
         "encodingFormat": "application/vnd.openactive.certification+json",
         "url": "https://openactive.io/openactive-test-suite/example-output/controlled/certification/"
       }
     ]
}

Advantages:

Disadvantages:

nickevansuk commented 1 year ago

Another comparison for the above: hasConformanceCertificate is similar to hasCredential, which already has a range of EducationalOccupationalCredential (which in turn subclasses CreativeWork).
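To make the difference between the two shapes discussed in this thread concrete, here is a small Python sketch of how a data consumer might collect certificate URLs from a bookingService object. It assumes hasConformanceCertificate is adopted as proposed above (it is only a suggestion in this thread) and also accepts the plain-URL featureList form from the 0.2 example; the function name certificate_urls is hypothetical.

```python
from typing import List


def certificate_urls(booking_service: dict) -> List[str]:
    """Collect conformance certificate URLs from either proposed shape."""
    urls: List[str] = []

    # Structured form: CreativeWork objects carrying a "url".
    certificates = booking_service.get("hasConformanceCertificate", [])
    if isinstance(certificates, dict):
        # JSON-LD permits a single object in place of a one-item list.
        certificates = [certificates]
    for certificate in certificates:
        if isinstance(certificate, dict) and "url" in certificate:
            urls.append(certificate["url"])

    # Plain form: featureList entries that are URLs (Text entries are skipped).
    features = booking_service.get("featureList", [])
    if isinstance(features, str):
        features = [features]
    for feature in features:
        if isinstance(feature, str) and feature.startswith(("http://", "https://")):
            urls.append(feature)

    return urls


CERT_URL = (
    "https://openactive.io/openactive-test-suite/"
    "example-output/controlled/certification/"
)

structured = {
    "@type": "SoftwareApplication",
    "name": "GladstoneOne",
    "softwareVersion": "2.0",
    "hasConformanceCertificate": [
        {
            "@type": "CreativeWork",
            "encodingFormat": "application/vnd.openactive.certification+json",
            "url": CERT_URL,
        }
    ],
}

plain = {
    "@type": "SoftwareApplication",
    "name": "GladstoneOne",
    "softwareVersion": "2.0",
    "featureList": [CERT_URL],
}

print(certificate_urls(structured))
print(certificate_urls(plain))
```

With the structured form, a consumer can also inspect encodingFormat before fetching the certificate, which the plain featureList URL cannot express.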