json-ld / json-ld.org

JSON for Linked Data's documentation and playground site
https://json-ld.org/

Update to https://schema.org. #750

Closed gkellogg closed 2 years ago

gkellogg commented 3 years ago

Update most contexts, examples and playground entries to use https://schema.org instead of http://schema.org.

Fixes #422.

davidlehn commented 3 years ago

I'm confused about what this change is meant to accomplish.

This patch is taking the approach of changing all identifiers to https://schema.org/. But with the current context the site points to, the http://schema.org identifiers would be used. This is going to cause confusion. I think changing any context references to https would be good. But until they change the context data they serve to have https identifiers, I think we have to keep them as http.

Additionally, this is changing all the examples from the spec? That may cause confusion if anyone is comparing them. Was the plan to update the specs too?

Very sad they didn't just do everything as https to begin with.

In any case, in theory the playground should be able to load the http context since the document loader rewrites that common URL specifically.
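
For illustration, a URL rewrite of that kind could look roughly like the sketch below. This is not the playground's actual loader; the name `rewrite_context_url` and the `SECURE_HOSTS` set are assumptions made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts whose context should always be fetched over https.
# Assumed list for this sketch; a real loader may special-case other URLs.
SECURE_HOSTS = {"schema.org"}


def rewrite_context_url(url: str) -> str:
    """Upgrade http:// to https:// for known hosts; return other URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in SECURE_HOSTS:
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

Note that this only changes where the context is fetched from. The term IRIs defined inside the fetched context are whatever the server returns.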

gkellogg commented 3 years ago

They are in the process of moving the context from http://schema.org/ to https://schema.org/. I think they did for v12, but pulled back due to some transitional issues. Both the HTTP and HTTPS versions are valid, and in the future they will move to HTTPS exclusively. I think we could do this at any time, but we could hold off until they have completed the move to an HTTPS context.

davidlehn commented 3 years ago

I think the context is working over https now? I only see redirects from http to https. But the context defines everything as http, like http://schema.org/name. Are you saying they are going to switch all IRIs in the context and vocab to be https? Or just change the context URL itself?

Since they make a https-everything vocab available, I was wondering if they were transitioning it all. If so, I'm curious how applications are supposed to support both ids at once. Changing old vocabs and data is a challenge and full reasoner tools are not the norm for lots of use cases.

gkellogg commented 3 years ago

> Are you saying they are going to switch all IRIs in the context and vocab to be https? Or just change the context URL itself?

The plan, IIRC, is to make https://schema.org/ the norm and to switch over the context accordingly. It will continue to be valid either way, but you shouldn't mix HTTP and HTTPS URLs in the same document. You'll continue to be able to retrieve the context either way, but I think that is where some got messed up.

I support both by maintaining two different vocabularies, with "schema" and "schemas" prefixes.

davidlehn commented 3 years ago

@gkellogg Are we talking past each other here? I'll try again.

Example:

{
  "ex:1": {
    "@context": "https://schema.org/",
    "name": "Ex 1"
  },
  "ex:2": {
    "@context": {
      "name": "https://schema.org/name"
    },
    "name": "Ex 2"
  }
}

At first glance one would think those are the same, but it currently expands to:

[
  {
    "ex:1": [
      {
        "http://schema.org/name": [
          {
            "@value": "Ex 1"
          }
        ]
      }
    ],
    "ex:2": [
      {
        "https://schema.org/name": [
          {
            "@value": "Ex 2"
          }
        ]
      }
    ]
  }
]

If you recompact with {"@context": "https://schema.org/"} you get:

{
  "@context": "https://schema.org/",
  "ex:1": {
    "name": "Ex 1"
  },
  "ex:2": {
    "https://schema.org/name": "Ex 2"
  }
}

This sort of behavior is correct, but will be confusing to anyone trying to mix the different properties. Anyone expanding data and trying to process it will have a terrible time dealing with duplicate http AND https properties for everything. It seems like changing the schema.org property ids to https is just a potential technical disaster. All expanded JSON-LD out there currently uses http://schema.org/xxx ids for everything.
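
One way an application might cope with the duplicate properties is to normalize expanded data to a single prefix before processing it. The sketch below is only an illustration under assumptions: the function names are hypothetical, and folding https onto http is an arbitrary choice that matches what the current context produces.

```python
HTTPS_PREFIX = "https://schema.org/"
HTTP_PREFIX = "http://schema.org/"

# Keywords whose values are literal data, not IRIs or nested nodes.
LITERAL_KEYWORDS = ("@value", "@language", "@direction", "@index")


def _to_http(iri):
    """Rewrite a single https://schema.org/ IRI to its http form."""
    if isinstance(iri, str) and iri.startswith(HTTPS_PREFIX):
        return HTTP_PREFIX + iri[len(HTTPS_PREFIX):]
    return iri


def normalize_schema_iris(node):
    """Recursively fold https://schema.org/ IRIs onto http://schema.org/
    in expanded JSON-LD, merging values when both forms are present."""
    if isinstance(node, list):
        return [normalize_schema_iris(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "@id":
            out[key] = _to_http(value)
        elif key == "@type":
            # A list of IRIs on node objects, a single IRI on value objects.
            if isinstance(value, list):
                out[key] = [_to_http(t) for t in value]
            else:
                out[key] = _to_http(value)
        elif key in LITERAL_KEYWORDS:
            out[key] = value  # never touch literal data
        else:
            new_key = _to_http(key)
            new_value = normalize_schema_iris(value)
            if isinstance(out.get(new_key), list) and isinstance(new_value, list):
                out[new_key].extend(new_value)  # both forms were present
            else:
                out[new_key] = new_value
    return out
```

Applied to the expanded output above, both "ex:1" and "ex:2" end up with an http://schema.org/name property. This works around the problem in one application; it does not make the two IRIs equivalent for anyone else consuming the data.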

My suggestion would be to change any @context URLs to https://schema.org/ to make fetches more secure. However, leave all the properties with a http://schema.org/ prefix that the current context is using. This will also probably confuse people, but hopefully in a different and less confusing way?

gkellogg commented 3 years ago

I don’t think we’re talking past each other. I believe they are going to change the context to use https://schema.org as the IRI prefix pervasively, and did originally for the v12 release, but rolled it back within the context. My thought was that we just pin the PR until they do.

Actually, many examples on the schema.org site are now using https://schema.org prefixes. Either is okay with them. The linter accepts either, but requires the namespaces to be used consistently.
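
A crude consistency check along those lines might look like the sketch below. It is not how the schema.org linter works (that is not described here); the function names are hypothetical, and the check simply scans the serialized document, so a string value that merely mentions a schema.org URL is counted too.

```python
import json
import re

# Matches the scheme of any schema.org IRI found in the serialized document.
SCHEMA_IRI = re.compile(r"\b(https?)://schema\.org\b")


def schema_org_schemes(doc) -> set:
    """Return the set of schemes ('http', 'https') used for schema.org IRIs."""
    return set(SCHEMA_IRI.findall(json.dumps(doc)))


def uses_consistent_namespace(doc) -> bool:
    """True when the document uses at most one schema.org scheme."""
    return len(schema_org_schemes(doc)) <= 1
```

Run against the expanded output earlier in this thread, `schema_org_schemes` reports both schemes, which is the mixed state being warned about.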

Of course, we could update existing contexts to point there, but you’ve already done this effectively for the playground. Most of our other examples don’t actually reference the context, they just use the namespace.

Of course, there could be a middle ground.

nicholascar commented 3 years ago

Just a follow-up here: schema.org is using HTTPS for everything - website, vocab etc. In RDFlib we have converted our namespaces to HTTPS and I'd like to see all references to schema.org updated to HTTPS to reduce errors/conflicts.

gkellogg commented 2 years ago

Obsolete now.