ESIPFed / science-on-schema.org

science-on-schema.org - providing guidance for publishing schema.org as JSON-LD for the sciences
Apache License 2.0

Recommendation for consistency in `schema.org` namespace #52

Closed datadavev closed 4 years ago

datadavev commented 4 years ago

Goal is to provide a recommendation for consistency in how the schema.org vocabulary is identified.

Consistent use of a namespace reference for resources such as the schema.org vocabulary is important for simplifying machine processing of markup. Several variants of the schema.org namespace have been seen in use, including: http://schema.org, https://schema.org, http://schema.org/, and https://schema.org/. All of these clearly refer to the same vocabulary, but parsers treat each as a distinct namespace.

Schema.org is somewhat ambivalent, with a slight emphasis on https.

Suggest writing up and adding to the CONVENTIONS.md document.

Recommendation will be for https://schema.org/

dr-shorthair commented 4 years ago

@danbri any guidance here?

mbjones commented 4 years ago

A recommendation would be great.  Full support to add this.

Given the focus on https on the web, I think we should recommend https, unless schema.org itself has declared the namespace as http.  Also, given that term URIs are constructed by appending the term to the schema context URI, I think it's important to use the trailing slash (so that term URIs look like https://schema.org/Dataset and not https://schema.orgDataset).
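To make the trailing-slash point concrete, here is a minimal Python sketch. It assumes the processor expands a term by plain string concatenation with the namespace IRI, which is how JSON-LD treats `@vocab`; the helper name `expand_term` is made up for illustration.

```python
# The four namespace spellings seen in the wild (listed in the issue description).
VARIANTS = [
    "http://schema.org",
    "https://schema.org",
    "http://schema.org/",
    "https://schema.org/",
]


def expand_term(vocab: str, term: str) -> str:
    """Expand a term by plain string concatenation with the namespace IRI.

    Hypothetical helper: it mimics what a JSON-LD processor does with @vocab,
    nothing more.
    """
    return vocab + term


# Map each namespace spelling to the term IRI it produces for "Dataset".
expanded = {vocab: expand_term(vocab, "Dataset") for vocab in VARIANTS}
for vocab, iri in expanded.items():
    print(f"{vocab!r:24} -> {iri}")

# Every spelling yields a different string, hence a different IRI for a parser.
distinct_iris = set(expanded.values())
```

Without the trailing slash the term is glued directly onto the host name, so the result is not even a sensible schema.org URL.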

danbri commented 4 years ago

http://schema.org/docs/faq.html#19

Note that the schema.org project itself uses (for data continuity) the http style internally within data files, but increasingly https is used in public markup

smrgeoinfo commented 4 years ago

Hmm, doesn't solve the problem. We need an authoritative recommendation on the string to use for identifying the schema.org namespace. who is the authority?

mbjones commented 4 years ago

Seems like the schema.org site is the authority. Based on the FAQ there that @danbri linked to, they say:

> This is a lengthy way of saying that both 'https://schema.org' and 'http://schema.org' are fine.

So, we should be free to recommend either of those, and I still think that it's best to recommend the https version given the general migration to https. But maybe our tools should try to accept either. The FAQ doesn't speak to the trailing slash issue. Maybe I am misunderstanding what would happen without a trailing slash in the context URI.

lewismc commented 4 years ago

This is an important issue. Over in Any23 (and further down in the Semargl parser system) we have encountered issues due to absence of trailing slash. See https://issues.apache.org/jira/browse/ANY23-428 for details.

datadavev commented 4 years ago

A hacky way around is to normalize the namespace when loading the graph. It's not desirable but it does work, so far at least, for content in the wild. Example using python rdflib: https://so-tools.readthedocs.io/en/latest/sotools.common.html#sotools.common.loadSOGraph Follow the source link for details.
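For readers who do not want to follow the link, the idea can be sketched in a few lines of standard-library Python. This is not the `loadSOGraph` implementation; it is a simplified stand-in that assumes triples are plain tuples of strings and that every element can be treated as an IRI. The names `normalize_iri` and `normalize_triples` are hypothetical.

```python
import re

# Matches either scheme, but only when the trailing slash is present.
# Slashless forms such as "http://schema.orgDataset" are deliberately left
# alone, because a looser pattern could also match unrelated hosts.
SO_PREFIX = re.compile(r"^https?://schema\.org/")


def normalize_iri(iri: str, target: str = "https://schema.org/") -> str:
    """Rewrite a schema.org term IRI so that it starts with `target`."""
    return SO_PREFIX.sub(target, iri, count=1)


def normalize_triples(triples, target: str = "https://schema.org/"):
    """Normalize every element of every triple.

    Simplification: literals are not distinguished from IRIs here, so a
    literal that happens to start with the schema.org prefix is rewritten too.
    """
    return [tuple(normalize_iri(part, target) for part in triple) for triple in triples]


RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
loaded = [
    ("https://example.org/ds1", RDF_TYPE, "http://schema.org/Dataset"),
    ("https://example.org/ds1", "https://schema.org/name", "Example dataset"),
]
normalized = normalize_triples(loaded)
```

The target namespace is a parameter, so the same step works whichever form the group ends up recommending.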

danbri commented 4 years ago

Having a normalisation step isn’t terribly hacky. With these kinds of representation you often have several ways of saying the same thing...


mbjones commented 4 years ago

Discussed on today's call. Consensus was that we recommend using https and a trailing slash, and that the trailing slash would be required. As requested, I added a new issue #59 for a SHACL shape for validating the namespace. So, this would mean that we are recommending the namespace https://schema.org/.

lewismc commented 4 years ago

+1


mbjones commented 4 years ago

The CONVENTIONS.md was updated in PR #55 and the examples in guidelines were updated in PR #62, so this issue has been resolved. Closing.

datadavev commented 4 years ago

This issue needs to be reopened or the conclusion changed, since the accepted resolution is contrary to the established namespace for schema.org, which is the IRI http://schema.org/. Although the URI https://schema.org/ resolves to the same content as http://schema.org/, RDF processing rules dictate that simple string comparison applies when matching IRIs, hence https and http designate different namespaces.

Schema.org uses http://schema.org/ as the namespace for programmatic resources, e.g.:

  1. https://schema.org/docs/jsonldcontext.jsonld
  2. https://schema.org/version/latest/schema.nt

(1) is the result of a request GET "accept: application/ld+json, application/json" http://schema.org/ which is used by some JSON-LD processors (e.g. jsonld.js).

(2) is a serialization of the schema.org vocabulary which can be used, for example, when evaluating class inheritance, such as in SPARQL queries.

In both cases, use of the https prefix for schema.org content breaks downstream processing since it is not matched in the context or vocabulary. In a nutshell, http://schema.org/Dataset != https://schema.org/Dataset.
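A small Python sketch of the failure mode, assuming a two-entry excerpt of the vocabulary's subclass hierarchy keyed by the http IRIs (as in the published vocabulary files mentioned above). The lookup is a stand-in for what a SPARQL property path would do; `is_a` is a hypothetical helper.

```python
# Tiny excerpt of the class hierarchy, using the http namespace.
SUBCLASS_OF = {
    "http://schema.org/Dataset": "http://schema.org/CreativeWork",
    "http://schema.org/CreativeWork": "http://schema.org/Thing",
}


def is_a(cls, ancestor):
    """Walk up the subclass chain using plain string equality, as RDF requires."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


# Content typed with the http namespace is found in the hierarchy.
http_match = is_a("http://schema.org/Dataset", "http://schema.org/CreativeWork")

# Content typed with the https namespace is not, although it "means" the same.
https_match = is_a("https://schema.org/Dataset", "http://schema.org/CreativeWork")

print(http_match, https_match)
```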

This change will have an impact on documentation and some deployed services, however it does appear to be necessary to ensure broader interoperability.

danbri commented 4 years ago

It may be time for us to look at moving schema.org's URIs to be https: based.

Thoughts?


datadavev commented 4 years ago

That would be great. How would legacy content be handled though, since this would be creating an entirely new namespace? Is there a way in RDF to specify that two namespaces are the same?

The fundamental problem of course is that the string http://schema.org/ is not equal to the string https://schema.org/. As a consequence these are treated as different IRIs by RDF processors which rely on the RDF specification asserting simple string comparison.

Now we can adjust our tooling to change the "http://schema.org/" to "https://schema.org/" any time a resource is loaded for processing, however besides being contrary to RDF processing rules, there are other problems with this approach:

  1. Additional processing overhead. In most cases this is a trivial annoyance.
  2. Third party libraries may load external resources for processing without providing an opportunity for translation. This is a big ~problem~ challenge. For example, in recent work using the jsonld.js library I was confused why framing was not working as expected until I learned that the library was loading the schema.org context from https://schema.org/docs/jsonldcontext.jsonld. These kinds of background processing are common in linked-data and it's not clear that a general solution is available.

mbjones commented 4 years ago

If schema.org published an equivalence schema, then it could be inferred. In RDF, providing owl:equivalentClass and owl:equivalentProperty would enable the co-existence of both namespaces, with a preference for the https one, and would enable inferences about the equivalence.
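As a rough illustration of what such an equivalence schema could contain, the following Python sketch prints N-Triples statements for a couple of terms. It is only a guess at the shape of the data, not something schema.org publishes; the term lists and the helper name `equivalence_triples` are made up for the example.

```python
OWL = "http://www.w3.org/2002/07/owl#"


def equivalence_triples(classes, properties,
                        old="http://schema.org/", new="https://schema.org/"):
    """Return N-Triples lines equating each `new` term with its `old` twin."""
    lines = []
    for name in classes:
        lines.append(f"<{new}{name}> <{OWL}equivalentClass> <{old}{name}> .")
    for name in properties:
        lines.append(f"<{new}{name}> <{OWL}equivalentProperty> <{old}{name}> .")
    return lines


# Hypothetical, hand-picked term lists; a real mapping would cover every term.
statements = equivalence_triples(classes=["Dataset", "Person"], properties=["name"])
print("\n".join(statements))
```

A consumer would still need an inferencing step to act on these statements; they do not change how plain IRI comparison behaves.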

In addition, could a new jsonld context file at https://schema.org/ provide a similar mapping? Wouldn't it work to have the following context file at the https context location for the 'schemas' context?

{
  "@context": {
      "schema":"http://schema.org/",
      "schemas": "https://schema.org/",
      "Organization": {"@id": "schema:Organization"},
      "Person": {"@id": "schema:Person"},
       ...
    }
}

Or I suppose the opposite http->https mapping could be provided. We do something similar in CodeMeta: https://doi.org/10.5063/schema/codemeta-2.0

datadavev commented 4 years ago

Was hoping for something a bit lighter weight than an equivalence schema.

The context approach won't achieve a statement of equivalence since for example, schema:Person != schemas:Person.

A variant of that approach does work OK for some cases though. Basically, determine the namespace then frame using the context that corresponds to the namespace. The resulting object has the expected keys for working in Javascript world, though of course, they are still different in RDF world. This approach works well enough when for example, binding the json-ld to reactive components for UI rendering or for indexer ingest.
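Roughly, in Python: the sketch below assumes the document has already been expanded by a JSON-LD processor (so `@type` values are full IRIs in a list) and only inspects top-level nodes. The function names are hypothetical, and the actual framing call is left to whichever JSON-LD library is in use.

```python
HTTP_NS = "http://schema.org/"
HTTPS_NS = "https://schema.org/"


def detect_namespace(expanded_nodes):
    """Return the schema.org namespace used by the @type values, or None.

    None means either no schema.org types were found or both namespaces are
    mixed, in which case a single framing context will not fit.
    """
    seen = set()
    for node in expanded_nodes:
        for type_iri in node.get("@type", []):
            for namespace in (HTTP_NS, HTTPS_NS):
                if type_iri.startswith(namespace):
                    seen.add(namespace)
    return seen.pop() if len(seen) == 1 else None


def framing_context(namespace):
    """Build the context to frame with, so compacted keys come out the same."""
    return {"@vocab": namespace}


nodes = [{"@id": "./A", "@type": ["http://schema.org/Dataset"]}]
namespace = detect_namespace(nodes)
context = framing_context(namespace) if namespace else None
```

Either way the framed object exposes keys such as `name` or `description`, which is what the UI or indexer code cares about.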

mbjones commented 4 years ago

> The context approach won't achieve a statement of equivalence since for example, schema:Person != schemas:Person.

I'm not so sure about that. In codemeta, our mappings get incorporated and the property types show up as the proper schema.org types -- you can see the type equivalence by expanding and compacting the JSON-LD instance docs. Here's an example where the instance fields are typed as CodeMeta fields, but the type equivalence in the context file causes them to be interpreted as the corresponding schema.org type. Check out the resolved URIs for the various properties in the Table or N-Quads tab of this instance doc in JSON-LD Playground.

Note that I had to use a direct link to the codemeta context file because the playground doesn't support the 3rd party DOI redirect.

mbjones commented 4 years ago

@danbri I've noticed that the Google Structured Data Testing Tool does not make those same equivalence relations that can be seen being resolved in the JSON-LD Playground. Is that intentional?

danbri commented 4 years ago

SDTT was never intended as a generic utility for all JSON-LD content. It is oriented towards data that is used by Google, which generally means Schema.org


datadavev commented 4 years ago

@mbjones, does the provided example map between namespaces? Seems that it is applying the namespace identified in the referenced context, not translating or mapping from one to another. The challenge we have basically comes down to asserting that ./A and ./B are of the same class:

{
  "@context":{
    "schema":"http://schema.org/",
    "schemas":"https://schema.org/"
  },
  "@graph": [ 
    {
      "@id": "./A",
      "@type": "schema:Dataset"
    },
    {
      "@id": "./B",
      "@type": "schemas:Dataset"
    }
  ]
}

Which is kind of like this illegal construct:

{
  "@context":{
    "schema":"http://schema.org/",
    "schemas":"https://schema.org/",
    "Dataset": { 
      "@id": ["schema:Dataset", "schemas:Dataset"]   <--Nope.
     }
  },
  "@graph": [ 
    {
      "@id": "./A",
      "@type": "Dataset"
    },
    {
      "@id": "./B",
      "@type": "Dataset"
    }
  ]
}

mbjones commented 4 years ago

@datadavev The mapping example I gave differs significantly from yours. In the first case, you don't do any mapping between the predicates for A and B. In the second, you definitely can't assign two values to @id.

The example instance doc I gave in the playground references the CodeMeta json-ld context (and not the schema.org context). The codemeta context then has mappings to schema.org in the form that I listed above in https://github.com/ESIPFed/science-on-schema.org/issues/52#issuecomment-632264895 . When the playground expands all of that, it sees, for example, that codemeta:contributor in the instance document is an instance of schema:contributor and assigns that schema-namespaced predicate in the expanded graph (e.g., _:b0 http://schema.org/contributor https://orcid.org/0000-0001-5135-5758). That happens despite not even defining schema.org's namespace in the instance document itself. CodeMeta terms that don't have a mapping to a schema.org term maintain the codemeta namespace from the codemeta context (e.g., _:b0 https://codemeta.github.io/terms/maintainer https://orcid.org/0000-0002-1642-628X). I think the same approach could be used to map the http://schema.org/ namespace to https://schema.org/, but I can't claim to really fundamentally understand how this works.

datadavev commented 4 years ago

> @datadavev The mapping example I gave differs significantly from yours. In the first case, you don't do any mapping between the predicates for A and B. In the second, you definitely can't assign two values to @id.

The example was to illustrate the problem that is the subject of this issue, i.e. inconsistency in the use of schema.org namespace as used by two graphs. We want our tools to consider A and B to be the same kind of thing, but by definition they are different. The second example is of course invalid and incorrect because it would be ambiguous - and that is the crux of our challenge. There are a bunch of resources that identify as http://schema.org/ and similarly a bunch that use https://schema.org/ and we want to make those interchangeable or pick one namespace and stick with it.

> The example instance doc I gave in the playground references the CodeMeta json-ld context (and not the schema.org context).

The instance doc references a remote context document that provides a number of term definitions in two namespaces, schema that expands to http://schema.org/ and codemeta for https://codemeta.github.io/terms/. It is correct that it does not directly reference the schema.org context document, but there is no need to since the namespace is referenced and terms are clearly defined to be in that namespace.

> The codemeta context then has mappings to schema.org in the form that I listed above in #52 (comment) .

That context provides a number of term definitions, for example the term contributor is identified as the compact URI schema:contributor which expands to http://schema.org/contributor.

> When the playground expands all of that, it sees, for example, that codemeta:contributor in the instance document is an instance of schema:contributor and assigns that schema-namespaced predicate in the expanded graph (e.g., _:b0 http://schema.org/contributor https://orcid.org/0000-0001-5135-5758).

There is no definition for codemeta:contributor, nor is there any entry by that name in the instance document or elsewhere. There is only contributor which is an alias for schema:contributor.

> That happens despite not even defining schema.org's namespace in the instance document itself. For CodeMeta terms that don't have a mapping to a schema.org term, they maintain the codemeta namespace from the codemeta context (e.g., _:b0 https://codemeta.github.io/terms/maintainer https://orcid.org/0000-0002-1642-628X).

The term maintainer is an alias for codemeta:maintainer which is stated in the context document.
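The aliasing can be mimicked with a toy expansion routine in Python. The context below is a cut-down imitation containing only the four entries discussed here, and `expand` is a hypothetical helper that handles just term aliases and compact IRIs, not the full JSON-LD expansion algorithm.

```python
# Cut-down imitation of the context: two prefixes and two term aliases.
CONTEXT = {
    "schema": "http://schema.org/",
    "codemeta": "https://codemeta.github.io/terms/",
    "contributor": {"@id": "schema:contributor"},
    "maintainer": {"@id": "codemeta:maintainer"},
}


def expand(term, context):
    """Resolve a term alias, then expand a compact IRI such as 'schema:x'."""
    definition = context.get(term, term)
    if isinstance(definition, dict):
        definition = definition["@id"]
    prefix, separator, suffix = definition.partition(":")
    # Only treat 'prefix:suffix' as compact when the prefix is defined as a
    # string in the context; absolute IRIs like 'http://...' pass through.
    if separator and not suffix.startswith("//") and isinstance(context.get(prefix), str):
        return context[prefix] + suffix
    return definition


contributor_iri = expand("contributor", CONTEXT)
maintainer_iri = expand("maintainer", CONTEXT)
print(contributor_iri)
print(maintainer_iri)
```

Nothing is translated from one namespace to another: each alias simply names a term in exactly one namespace.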

The document does illustrate the challenge of consistency with the schema.org namespace however, since it does use http://schema.org/ which would make it incompatible with the Science-on-schema.org guidelines as they currently stand.

datadavev commented 4 years ago

[re-opening since I fat fingered the close button vs. comment]

fils commented 4 years ago

Forgive me if I find this fun to post here.... :)

https://github.com/schemaorg/schemaorg/issues/2597

datadavev commented 4 years ago

The proposal appears to be for URLs in examples and elsewhere to prefer https://schema.org/, not to alter the namespace.

fils commented 4 years ago

@datadavev no I agree and understand.. but I had to have some fun and post it to bring back up the whole http vs https thing. :) I'm just still depressed all our namespaces will have http in them. I fully understand the issues re: SHACL and other JSON-LD processing. Honestly I still feel we are kicking the can down the road.. but I gave up the fight here.

datadavev commented 4 years ago

Maybe there's some merit to promoting a protocol-relative scheme for namespaces? So for example: //science-on-schema.org/ns/

fils commented 4 years ago

I'll stop poking and agree that seems like an interesting discussion. Note we will also need to add in link headers for any science on schema context we do as well. I'm going to work this into my code that I'll put forward for this. schema.org recently dropped support for content negotiation. They instead adopted a Link header, with rel="alternate" and type="application/ld+json".

Easy to support...
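A client-side sketch in Python, for what it's worth. It assumes the response carries a Link header shaped like the sample value below (the sample is illustrative, not copied from a live response) and uses a deliberately simple parser rather than a full RFC 8288 implementation; `find_jsonld_alternate` is a made-up name.

```python
import re


def find_jsonld_alternate(link_header):
    """Return the target of the first rel="alternate" JSON-LD link, or None."""
    # Split on commas that are followed by the '<' opening the next link.
    for part in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", part)
        if not match:
            continue
        target, params_text = match.groups()
        params = {
            key.lower(): value
            for key, value in re.findall(r';\s*([\w-]+)\s*=\s*"?([^";]*)"?', params_text)
        }
        # rel may hold several space-separated relation types.
        relations = params.get("rel", "").split()
        if "alternate" in relations and params.get("type") == "application/ld+json":
            return target
    return None


# Illustrative header value; the path is an assumption for the example.
sample = '</docs/jsonldcontext.jsonld>; rel="alternate"; type="application/ld+json"'
context_location = find_jsonld_alternate(sample)
```

The returned target may be relative, so a caller would resolve it against the request URL (for example with `urllib.parse.urljoin`) before fetching the context.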

danbri commented 4 years ago

Currently the Schema.org ns looks like this: https://schema.org/docs/jsonldcontext.jsonld

We would like to understand the potential impact if it were edited from

    "@vocab": "http://schema.org/",

to

    "@vocab": "https://schema.org/",

i.e. if the https- form were used in derived triples. It is already fine to use https in the @context declaration. For example, if we were to say that from 15 Jan 2021, we'll make that change and also use that form in the canonical version of our machine readable schemas, would that break things bigtime in your community? I'm open to having schema.org public howtos / mapping files / sparql construct queries or whatever might make the transition go smoothly for everyone.

datadavev commented 4 years ago

My understanding is that the consequences of changing the value of @vocab for schema.org (or any established namespace) would be far broader than this community, though this is an issue for which I would very much enjoy being wrong.

The fundamental problem is disharmony between comparison algorithms used in RDF (strictly literal, [1]) and that for dereferencing URIs [2] which is more functional in nature depending on the application.

Use of http vs https as the URL pointing to the context document should have no impact since the same document is returned. The URL pointing to the context document as used in a JSON-LD document has no bearing on the content of the context document (providing it resolves to the same content).

RDF stipulates "Two RDF URI references are equal if and only if they compare as equal, character by character, as Unicode strings." [1] Hence, changing the value of @vocab would break any processors that rely on comparison of terms since adding the "s" would change the base IRI.

For example, a processor would treat http://schema.org/Thing and https://schema.org/Thing as different types. This means that any content that uses http://schema.org/ as a base IRI would require special treatment to support comparison with https://schema.org/. The "special treatment" can be very simple for trivial cases, but can get quite complex when third party libraries and resources contribute to the processing pipeline.
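The two notions of "same" can be put side by side in Python. The first function is the literal comparison RDF calls for; the second is one possible application-level policy (ignore the http/https scheme) and is an assumption made for illustration, not something RFC 3986 or RDF mandates. Both function names are hypothetical.

```python
from urllib.parse import urlsplit


def rdf_equal(a: str, b: str) -> bool:
    """RDF term equality: character-by-character string comparison."""
    return a == b


def same_ignoring_scheme(a: str, b: str) -> bool:
    """Application-level policy: treat http and https forms as the same.

    Everything after the scheme (host, path, query, fragment) must match.
    """
    parts_a, parts_b = urlsplit(a), urlsplit(b)
    web_schemes = {"http", "https"}
    return {parts_a.scheme, parts_b.scheme} <= web_schemes and parts_a[1:] == parts_b[1:]


http_thing = "http://schema.org/Thing"
https_thing = "https://schema.org/Thing"

literal_result = rdf_equal(http_thing, https_thing)
policy_result = same_ignoring_scheme(http_thing, https_thing)
print(literal_result, policy_result)
```

The "special treatment" is whatever bridges those two answers, and it has to be applied everywhere terms are compared, including inside third party libraries.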

Hence, my impression is that changing the value of @vocab for schema.org would be a very significant change requiring broad discussion and consensus prior to implementation. I imagine this topic has arisen for other communities and so I expect there to be extensive prior art for consideration.

[1] https://www.w3.org/TR/rdf-concepts/#section-Graph-URIref [2] https://tools.ietf.org/html/rfc3986#section-6

smrgeoinfo commented 4 years ago

Sadly, this revisits and highlights the original argument for using URNs, which carry no expectation that they dereference on their own. They are identifier strings (a bitstream), independent of the dereferencing service. The dereferencing service is a separate web location/resource that might change over time, for which there might be multiple endpoints, and these services might provide different (but semantically equivalent) representations based on content negotiation. The location of a JSON-LD context document (or xsd, or JSON schema) should be independent of (though of course related to) the namespace URI for the version of a vocabulary.

Can the schema.org namespace URI (hopefully versioned!!!) be separated from the web location (URL) of context docs, xsd, JSON schema, etc. that constrain implementations/realizations of that namespace? A namespace definition should specify how to access the implementation resources given the namespace URI. (just imagining a workable web architecture here... :) )

mbjones commented 4 years ago

@datadavev I started a new branch (feature_52_namespace_consistency) with an ADR to summarize our namespace decisions. I’m not entirely clear on your landing spot on this dave, so if you could edit the ADR with the full recommendation that would be helpful. Then I think we have to update all of the examples to follow that advice consistently. But first let's see if we have agreement on the ADR. See https://github.com/ESIPFed/science-on-schema.org/blob/feature_52_namespace_consistency/decisions/52-namespace-consistency.md

amoeba commented 4 years ago

My 2c on this is that we need to use the IRI that makes our documents "true". As of writing, this IRI is http://schema.org/ and no others because that's what's in the schema.org context file. My opinion comes from two things:

  1. @vocab in the JSON-LD spec is an IRI and not merely a locator and,
  2. During expansion, the value of @vocab gets smushed with the term to form the full IRI (matching @datadavev's assertion about the RDF spec w/r/t resource equivalence).

If schema.org decides to publish those terms with new IRIs, this all might change depending on how that's done. Happy to comment/edit on the ADR once others have had a chance at it.

datadavev commented 4 years ago

See also further discussions:

mbjones commented 4 years ago

Thanks, @datadavev . The final comment by @RichardWallis that closed https://github.com/schemaorg/schemaorg/issues/2018#issuecomment-653579061 indicates that the change was 'Implemented'. Do you know what that means? Does it mean they did the things outlined in https://github.com/schemaorg/schemaorg/issues/2018#issuecomment-407345451 ?

That thread was really helpful in that it clarified that the goal was indeed to eventually move the namespace to https based URIs for the SO term namespace. Does implemented 15 days ago mean that is now complete? Maybe @danbri or @RichardWallis would have sage advice for us on this decision for our guidelines.

I also am encouraged to learn from that thread that there are owl:sameAs statements in the relevant on-page RDFa, JSON-LD & RDF/XML, and dumps between the http and https URIs for each term. I hadn't noticed that. So, we should probably include in our recommendations on this that consumers and harvesters should import those equivalence relations and treat the http- and https-based terms as logically equivalent. This will be harder for processors that don't employ any kind of inferencing, but for many harvesters that do convert to RDF it should be fairly straightforward to equate the namespaces because they are all 1:1 mappings of URIs.

Bottom line: Issue https://github.com/schemaorg/schemaorg/issues/2018 is making me reconsider our decision to recommend http over https for term URIs. If the SO community is about to make the namespace switch, we should be consistent with them in our guidelines for new content (notwithstanding all of the salient reservations you have raised here and in related issues).

RichardWallis commented 4 years ago

@mbjones The comment 'implemented' I made when closing old issue https://github.com/schemaorg/schemaorg/issues/2018#issuecomment-653579061 was in reference to the specific requirement to include within the HTML of the term pages on the schema.org site a rel=canonical statement linking to the https version of the page (eg <link rel="canonical" href="https://schema.org/Book" />) when we moved to a fully https website.

As discussed in other issues (eg. https://github.com/schemaorg/schemaorg/issues/2516) SDO has yet to move to fully https within the vocabulary itself.

Consensus is that such a move should be made in two probable steps. 1) Change references within examples on SDO site to be https plus add an owl:sameAs, or similar, to the definition of all terms to aid inferencing. 2) Move the whole vocabulary to be https based (still with inferencing links to HTTP).

With the huge number of SDO implementations, documentation, and established practice (including misunderstandings of the difference between the URI of terms and the URLs of the site that describes them), you can understand that such moves should be made with care and consideration.

I have no timescale for these moves, but they are on the agenda. Monitoring issue https://github.com/schemaorg/schemaorg/issues/2516 is the best advice I can give at the moment. ~Richard

danbri commented 4 years ago

We will likely start referring to an https-based set of definitions as the primary, canonical, etc representation of Schema.org

It will be harmless to continue to have mappings to the http URIs but this seems like a good time to collect up information about places where mappings exist

Whether people canonicalize to http or https within particular datasets and implementations is (already) their own business. Converting schema release data definitions between the two is not complex. I would recommend canonicalising one way or the other rather than mixing both and hoping inference sorts it out.

Timescales: I would like to see us https-centric before 2021, alongside whatever supporting mechanisms the implementor community need

There will always be a mix of data in both styles. Canonicalising the http vs https prefix is just one of many kinds of cleanup needed to make use of diverse data


datadavev commented 4 years ago

Looks like SO has started exposing contexts with a choice of either https or http, see: https://schema.org/docs/developers.html.

For example: https namespace: https://schema.org/version/latest/schemaorg-current-https.jsonld

http namespace: https://schema.org/version/latest/schemaorg-current-http.jsonld

Consumers now need to handle mapping between the two schemes, since of course http://schema.org/Dataset is not the same as https://schema.org/Dataset. It's not immediately apparent if there is a mapping between the two namespaces, even though it should be obvious that http:// should be interpreted the same as https:// for term IRIs.

It will be interesting to see how long confusion between the namespaces will continue - hopefully will settle quickly.

I expect this means this issue can be closed and the recommendation will be for a namespace of https://schema.org/ for all new content, with existing content updated where possible.

RichardWallis commented 4 years ago

We are working on it - watch this space.

~Richard


danbri commented 4 years ago

To add to Richard's comment: this is just a foundation. It is not directly about JSON-LD contexts, or mappings, just the basic convenience of having the otherwise identical definitions be available in both http and https flavours.

Note that http and https -flavoured schema.org triples are very much not new. The Schema.org FAQ has actively encouraged sites to use https if they prefer it, for 5 years or so. All serious usage of Schema.org from public web will already be dealing with this trivial difference (fixable with one search/replace).

By far the larger and harder problem is not this kind of variation, where the two flavours are otherwise identical except for a few "s" characters, but rather the case where subtly different graph patterns - shapes - make different legitimate usage of the same underlying terms. You can expect to see more activity around ShEx and SHACL for those issues too....


datadavev commented 4 years ago

Thanks for the updates and clarification on the forward path for schema.org.

My reading of the FAQ was less prescriptive, and interpreted as suggesting https for referencing resources such as the context document and documentation. The definition of terms though has always been with http://schema.org/. This was apparent in the examples, markup in the wild, and the context documents.

Alignment of the namespace and the resource locations is a welcome clarification. There may be a few short term implementation challenges, though should be better in the long run.

The situation seems clear for this group, so I'm closing this issue with the resolution to identify schema.org terms with the https://schema.org/ prefix.

Consumers should consider https://schema.org/ and http://schema.org/ to be practically equivalent. Producers should be encouraged to use https://schema.org/ and update existing resources where feasible.

danbri commented 4 years ago

Thanks @datadavev - that seems like a good summary. Please keep us posted on the progress of this work (e.g. a mail to https://lists.w3.org/Archives/Public/public-schemaorg/ from time to time)