schemaorg / suggestions-questions-brainstorming

Suggestions, questions, and brainstorming

Proposal: move additionalProperty up to Thing attribute #117

Open noirbizarre opened 6 years ago

noirbizarre commented 6 years ago

Right now, the only mechanism for adding extra custom properties is additionalProperty.

There are a lot of issues advising people to use this attribute to declare extra custom properties but, sadly, it's only available on a very small subset of classes:

So the proposal is to make additionalProperty a Thing property, because this property is non-specific and not domain-related.

This will allow every child class to benefit from it (in my case I need it in particular on Dataset and DataDownload).

thadguidry commented 6 years ago

Describe what you need on Dataset.

Describe what you need on DataDownload.

Open separate issues for those 2 needs, so that the whole community can help you further.

noirbizarre commented 6 years ago

I deliberately proposed this change on Thing and not on Dataset or DataDownload, because this is not the first time I have tried to use it on something else. In fact, I need it on Organization and Person too, and I have already needed it on other classes.

Thing is the only common abstract ancestor, and it already holds all the generic properties. This is why it seems logical (to me) that additionalProperty should be a Thing property.

(I also have some specifics on Dataset and DataDownload and will submit two other separate issues.)

mfhepp commented 6 years ago

This was also the original design when I initially submitted the additionalProperty and PropertyValue model. Back then there was some fear that having this at the level of Thing might foster abuse in lieu of properly defined language elements. I still support widening the use of additionalProperty.

Best, Martin


martin hepp www: http://www.heppnetz.de/ email: mhepp@computer.org


Download commented 6 years ago

What I was pretty amazed at when first learning about schema.org / JSON-LD is how restrictive the schema is towards custom / unknown properties. Why do we even need additionalProperty?

There is additionalType which makes some sense. You create additional schemas to support custom properties. Create a more specific type that has your custom properties, specify it in a schema and then use additionalType to add those properties to whichever object. However, the problem here is the overhead it gives. Creating a custom schema is a lot of work.

Why not just ignore unknown fields??

I understand it could give collisions later on when fields with the same name are added to the spec, but that could be easily solved with a naming convention (e.g. custom property names should start with an underscore).

IMHO, the way it currently works makes schema.org a much less flexible tool for many purposes. For example, I investigated using schema.org entity types as the model for a web API, but the problem I ran into is that adding custom properties is either ugly / impossible (as this issue demonstrates) via additionalProperty or just way too much work (via additionalType). I would like it MUCH more if the default behavior were just to ignore unknown fields.

danbri commented 6 years ago

To be clear, additionalProperty is a pretty weak approach, in the sense that it handles very well the case where we don't really know what the properties are. This is common in ecommerce, where a site might have foo=bar, xyz=1000, or for geo/place, things like horse=true.
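To illustrate the "we don't really know what the properties are" case, here is a minimal Python sketch that wraps loosely understood key/value pairs as PropertyValue nodes under additionalProperty. The helper name `property_value`, the product name, and the attribute values are assumptions made up for the example; only the schema.org terms (Product, PropertyValue, additionalProperty, name, value) come from the vocabulary.

```python
import json


def property_value(name, value):
    """Wrap one loosely understood key/value pair as a schema.org PropertyValue."""
    return {"@type": "PropertyValue", "name": name, "value": value}


# Hypothetical site-specific attributes, echoing the foo=bar / xyz=1000 examples.
raw_attributes = {"foo": "bar", "xyz": 1000}

product = {
    "@context": "https://schema.org/",
    "@type": "Product",
    "name": "Example product",  # placeholder name
    "additionalProperty": [property_value(k, v) for k, v in raw_attributes.items()],
}

print(json.dumps(product, indent=2))
```

The publisher does not need to know what "foo" or "xyz" mean for this to work, which is both the appeal and the weakness of the approach.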

Aside from these you can always mix in other vocabularies, regardless of whether they are presented as "extensions" (like gs1.org/voc) to Schema.org or not. Whether search engine features make much use of these is beyond schema.org's remit, but the data structure is perfectly sensible and useful. So as @mfhepp says, we shouldn't encourage over-use of additionalProperty when normal properties from nearby vocabularies can be used instead.
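Mixing in another vocabulary could look roughly like the following Python sketch, which builds a JSON-LD document combining schema.org with a second namespace. The `ex` prefix, the `https://vocab.example.org/` namespace, and the `ex:updateFrequency` property are hypothetical stand-ins for any real external vocabulary; the helper `external_terms` is likewise invented for the example.

```python
import json

# "ex" and its namespace are hypothetical stand-ins for an external vocabulary.
doc = {
    "@context": ["https://schema.org/", {"ex": "https://vocab.example.org/"}],
    "@type": "Dataset",
    "name": "Example dataset",
    "ex:updateFrequency": "weekly",  # hypothetical property from the other vocabulary
}


def external_terms(node):
    """List keys that use an explicit prefix, i.e. terms from outside schema.org."""
    return [key for key in node if ":" in key and not key.startswith("@")]


print(json.dumps(doc, indent=2))
print(external_terms(doc))
```

Because the external property is attached directly to the Dataset, no intermediate PropertyValue node is needed, and the prefix makes it obvious which terms are not schema.org terms.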

noirbizarre commented 6 years ago

I don't think using additionalProperty will encourage schema-less attributes: if I use schema.org/JSON-LD, it is because I want structured data.

I don't see additionalProperty as a replacement for strongly typed properties but as a complementary solution:

I think it would be easier to detect a recurring additionalProperty and submit a model evolution proposal.

Plus, I've been looking through the issues, and a recurring response is: "You should use additionalProperty for this but this is not available on this class":

danbri commented 6 years ago

Ping @rvguha - what do you think?

VladimirAlexiev commented 5 years ago

I second this proposal. Furthermore, https://schema.org/additionalProperty claims it applies to QualitativeValue and QuantitativeValue. But none of the examples shows such use.

vholland commented 4 years ago

Are there remaining issues to discuss here?

mfhepp commented 4 years ago

No, I think we can simply go forward.

As I was the original designer of this proposal, I would like to explain the motivation a bit:

There are many cases where publishers of data have identifiers for clearly defined properties. These can be either properties applied to all products in a shop, or properties from external standards like eCl@ss.

For those properties, we at least have a site-specific unique name or even a globally unique identifier from the external standard.

additionalProperty allows publishers to expose as much data semantics as they reasonably can without imposing a lot of additional barriers. This is better than a black or white approach where the property is either in schema.org or not supported.

The full rationale is described here: https://www.w3.org/wiki/WebSchemas/PropertyValuePairs
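As a rough illustration of the two cases above (a site-specific unique name versus a globally unique identifier), the following Python sketch uses the propertyID field of PropertyValue. The helper `identified_property`, the product, the `chuck_size` key, and the `standard.example.org` URL are all hypothetical placeholders; no real eCl@ss identifier is used.

```python
import json


def identified_property(name, value, property_id, unit_text=None):
    """Build a PropertyValue that also carries an identifier for the property itself."""
    node = {
        "@type": "PropertyValue",
        "name": name,
        "value": value,
        "propertyID": property_id,
    }
    if unit_text is not None:
        node["unitText"] = unit_text
    return node


product = {
    "@context": "https://schema.org/",
    "@type": "Product",
    "name": "Example drill",  # placeholder product
    "additionalProperty": [
        # Case 1: site-specific unique name (hypothetical shop attribute key).
        identified_property("Chuck size", 13, "chuck_size", unit_text="mm"),
        # Case 2: globally unique identifier from an external standard (placeholder URL).
        identified_property(
            "Rated voltage",
            18,
            "https://standard.example.org/property/rated-voltage",
            unit_text="V",
        ),
    ],
}

print(json.dumps(product, indent=2))
```

A consumer that recognises the propertyID can treat the value as well-defined; one that does not can still fall back to the human-readable name.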

rvguha commented 4 years ago

I would prefer to hold off on such a radical move.


danbri commented 4 years ago

Agree with @rvguha here.

Schema.org is not a closed system; you can always add additional properties from other vocabularies when they're missing in schema.org.

It would be better to encourage that practice than to add an extra layer of indirection. The case of Product is reasonably special: aggregators do have useful property/value pairs that they barely understand. But we added a health warning into PropertyValue for good reason...

Always use specific schema.org properties when a) they exist and b) you can populate them. Using PropertyValue as a substitute will typically not trigger the same effect as using the original, specific property.

vholland commented 4 years ago

To advocate for this, there is some benefit to allowing developers to use https://schema.org/additionalProperty for experimenting before they get the schema correct.

As examples, there was some discussion in issue schemaorg/schemaorg#561 about the language at the other end of https://schema.org/EntryPoint. I would not say we have an answer yet, but using additionalProperty in WatchAction links has allowed some experimentation in the sorts of parameters a data reader might want. I still hope we can codify a good pattern for structuring this data.

Similarly, Google once used additionalProperty to understand JobPostings but once the space was clearer, worked with the community to adopt https://schema.org/jobLocationType.

I would not expect any reader to understand all uses of additionalProperty, but there are times like @mfhepp suggests where some data is better than none, particularly when exploring areas where authors have not previously provided structured data.

rvguha commented 4 years ago

I am not a fan of PropertyValue. It is basically an escape mechanism out of using schema.org vocabulary, while claiming to be using schema.org. And in doing so, it changes the structure of the graph.

If schema.org does not have the vocabulary, things are much cleaner if you create new vocabulary in whatever namespace and use it.

guha
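The point about graph structure can be sketched in Python as follows. This is a deliberately simplified flattener, not a real JSON-LD processor: it ignores contexts and IRI expansion, and `to_triples`, the `ex:horsepower` property, and the `_:item` / `_:bN` labels are all assumptions invented for the illustration.

```python
from itertools import count


def to_triples(node, subject="_:item", ids=None):
    """Flatten a nested dict into (subject, predicate, object) triples.

    Nested dicts become blank nodes labelled _:b1, _:b2, ...
    """
    ids = ids if ids is not None else count(1)
    triples = []
    for key, value in node.items():
        if key == "@context":
            continue
        for item in value if isinstance(value, list) else [value]:
            if isinstance(item, dict):
                blank = f"_:b{next(ids)}"
                triples.append((subject, key, blank))
                triples.extend(to_triples(item, blank, ids))
            else:
                triples.append((subject, key, item))
    return triples


# Direct use of a property from some other namespace (hypothetical "ex:horsepower").
direct = {"@type": "Product", "ex:horsepower": 150}

# The same fact routed through additionalProperty / PropertyValue.
indirect = {
    "@type": "Product",
    "additionalProperty": {"@type": "PropertyValue", "name": "horsepower", "value": 150},
}

for triple in to_triples(direct):
    print("direct:  ", triple)
for triple in to_triples(indirect):
    print("indirect:", triple)
```

In the direct form the value hangs off the Product itself; in the PropertyValue form it sits one hop further away on an intermediate node, and the property name becomes a string value rather than a predicate.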


danbri commented 4 years ago

While I share that concern, there are circumstances (like long-tail ecommerce properties) where it has value. Generally, this is wherever the party creating the markup (e.g. a WordPress extension author or an SEO) is not the party responsible for, and knowledgeable about, the underlying set of properties, or where they wouldn't really know how to "create new vocabulary in whatever namespace" anyway.

Maybe a "free for all" chaotic namespace like http://anything.example.org/ would allow this, for publishers who don't want to set up one for their own specific property set, e.g. in JSON-LD:

{
  "@context": ["https://schema.org/", "http://anything.example.org/"],
  "@type": "Person",
  "name": "Alice",
  "foo": "bar"
}

...but doing this (following the JSON-LD context rules) would make it difficult to know whether "foo" was a typo of a term in schema.org that should be challenged in validation, or a term from the "anything goes" namespace.
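A rough Python sketch of that ambiguity, assuming a validator that only has a list of schema.org term names to go on. `KNOWN_SCHEMA_TERMS` is a tiny hand-picked subset and `classify_term` is a hypothetical helper; the fuzzy matching via `difflib` is a heuristic chosen for the example, not something schema.org validators are known to do.

```python
import difflib

# Tiny, hand-picked subset of schema.org term names (the real list is far longer).
KNOWN_SCHEMA_TERMS = ["name", "email", "url", "jobTitle"]


def classify_term(term):
    """Guess whether an unrecognised key is a near-miss of a schema.org term."""
    if term in KNOWN_SCHEMA_TERMS:
        return ("schema.org", term)
    close = difflib.get_close_matches(term, KNOWN_SCHEMA_TERMS, n=1)
    if close:
        return ("possible typo", close[0])
    # Could be an intentional "anything goes" term, or a typo too mangled to match.
    return ("unknown", None)


for key in ["name", "nmae", "foo"]:
    print(key, classify_term(key))
```

With an "anything goes" namespace in the context, every key in the last two categories is technically valid, so the validator can only guess at the author's intent.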


RichardWallis commented 4 years ago

See issue #7 for the context of the move from the main Schema.org issue tracker to this repository.