ct-martin / ctmartin-hugo-theme

Hugo theme for my websites
MIT License

Investigate how to properly put multiple schema types on a page #11

Open ct-martin opened 4 years ago

ct-martin commented 4 years ago

According to https://schema.org/Article, you should put a top-level array with multiple objects...

According to https://schema.org/ScholarlyArticle, you should use hasPart or @graph structures...

According to https://schema.org/ArchiveComponent, you should pass an array for the @type parameter...

According to https://schema.org/WebPage, "Every web page is implicitly assumed to be declared to be of type WebPage"...

According to https://schema.org/CreativeWork, all of the above are valid...

So which should I use?


More rambling:

https://schema.org/mainEntity and https://schema.org/mainEntityOfPage make it unclear whether you should declare a WebPage about a thing or a thing represented by a WebPage...

https://schema.org/url says just throw a url on whatever

https://github.com/schemaorg/schemaorg/issues/1769 has multiple ways to do an ImageGallery

ct-martin commented 4 years ago

I found https://www.w3.org/TR/json-ld/ and JSON-LD is complicated... anyway, arrays at the top-level are meant to give a disconnected graph (items in the array aren't necessarily related). Using an array for @type is discouraged, but not disallowed... I need to think about how this will fit into articles, e.g. blog posts that are also technical, for deciding what makes the most sense.
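To make the three shapes concrete, here is a minimal sketch in Python that builds each one as a plain dict/list and serializes it. The URL, headline, and function names are made up for illustration, and the choice of BlogPosting plus TechArticle for the multi-type case is just my "technical blog post" example, not a recommendation.

```python
import json

SCHEMA = "https://schema.org"
PAGE_URL = "https://example.com/posts/hello/"  # hypothetical URL


def top_level_array(url):
    """Top-level array: two nodes with no stated relationship between them."""
    return [
        {"@context": SCHEMA, "@type": "WebPage", "url": url},
        {"@context": SCHEMA, "@type": "Article", "headline": "Hello"},
    ]


def graph_form(url):
    """@graph: several nodes in one document, linked explicitly through @id."""
    return {
        "@context": SCHEMA,
        "@graph": [
            {"@type": "WebPage", "@id": url},
            {
                "@type": "Article",
                "headline": "Hello",
                "mainEntityOfPage": {"@id": url},
            },
        ],
    }


def multi_type(url):
    """Array-valued @type: a single node that carries two types at once."""
    return {
        "@context": SCHEMA,
        "@type": ["BlogPosting", "TechArticle"],
        "headline": "Hello",
        "mainEntityOfPage": {"@type": "WebPage", "@id": url},
    }


if __name__ == "__main__":
    for build in (top_level_array, graph_form, multi_type):
        print(build.__name__)
        print(json.dumps(build(PAGE_URL), indent=2))
```

Only the @graph and multi-type forms actually say how the page and the article relate; the top-level array leaves that to the consumer.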

Based on https://developers.google.com/search/docs/data-types/article, it looks like the best approach is to pick the most semantic of the content on the page (ImageGallery, Recipe, Article, etc.), and then have a mainEntityOfPage with a url (if the page type doesn't already inherit WebPage, like ImageGallery). This approach seems confirmed by https://github.com/schemaorg/schemaorg/issues/1115#issuecomment-215633901 , which states:

Some modeling approaches recommend defining both variants of the same relationship type and marking them as inverses of each other, so a computer can infer the same information from either way.

In schema.org, we tend to avoid inverses and aim at defining the more popular direction only (with a few exceptions).
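A rough sketch of the "pick the most semantic type, then add mainEntityOfPage unless the type is already a WebPage" rule described before the quote. The helper name, the example URLs, and the WEBPAGE_TYPES set are my own; the set is a hand-picked, incomplete list of types that inherit from WebPage on schema.org, not something generated from the vocabulary.

```python
import json

# Partial, hand-maintained list of schema.org types that inherit from WebPage.
# Assumption: extend this to cover whatever page types the site actually uses.
WEBPAGE_TYPES = {
    "WebPage",
    "CollectionPage",
    "MediaGallery",
    "ImageGallery",
    "VideoGallery",
    "AboutPage",
    "ContactPage",
    "ProfilePage",
}


def main_entity(schema_type, page_url, **props):
    """Build one node for the page's main content.

    Types that are not themselves a WebPage get a mainEntityOfPage pointing
    at the page; WebPage subtypes (e.g. ImageGallery) are left without one.
    """
    node = {"@context": "https://schema.org", "@type": schema_type}
    node.update(props)
    if schema_type not in WEBPAGE_TYPES:
        node["mainEntityOfPage"] = {"@type": "WebPage", "@id": page_url}
    return node


# Hypothetical pages, for illustration only.
article = main_entity("Article", "https://example.com/posts/hello/", headline="Hello")
gallery = main_entity("ImageGallery", "https://example.com/gallery/", name="Photos")

if __name__ == "__main__":
    print(json.dumps(article, indent=2))
    print(json.dumps(gallery, indent=2))
```

Only the thing-to-page direction (mainEntityOfPage) is emitted, which matches the "define the more popular direction only" advice in the quote.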

Additionally, based on the GitHub issue linked in the OP, I'm not going to try to include a listing of images in the gallery b/c I lack context and there isn't a clear answer on how to mark up more than one image.

Based on https://schema.org/docs/datamodel.html#mainEntityBackground , url is for the authoritative location of something, so since I sometimes reference other things, I shouldn't use it until I've done some future planning for what I want to do with my site. However, https://developers.google.com/search/docs/data-types/carousel uses the url property on things like Recipe... need to think about how to approach this... additionally, I don't fully understand @id vs url yet; I think this is also partly b/c I'm using schema for web purposes only.

For the overall site, https://schema.org/Blog is a CreativeWork and not a child of WebPage, so if I consider that particular subdomain to be a blog, the home page should be a Blog, otherwise it should be a WebSite? Also, https://schema.org/blogPost indicates that a Blog may list a post, but there's no inverse property; so an Article/BlogPosting does not need to be declared as part of a Blog, it just needs the mainEntityOfPage.
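If the home page does end up as a Blog, it could look something like the sketch below. The function name, blog name, and URLs are placeholders. Because Blog is not a WebPage, it gets a mainEntityOfPage like any other non-page type, and the posts are only listed from the Blog side via blogPost.

```python
import json


def blog_home(name, home_url, posts):
    """Blog node for the home page; posts is a list of (headline, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "Blog",
        "name": name,
        "mainEntityOfPage": {"@type": "WebPage", "@id": home_url},
        # The Blog lists its posts; a post page itself carries no back-link.
        "blogPost": [
            {
                "@type": "BlogPosting",
                "headline": headline,
                "mainEntityOfPage": {"@type": "WebPage", "@id": url},
            }
            for headline, url in posts
        ],
    }


# Hypothetical site data, for illustration only.
home = blog_home(
    "Example Blog",
    "https://blog.example.com/",
    [
        ("First post", "https://blog.example.com/posts/first/"),
        ("Second post", "https://blog.example.com/posts/second/"),
    ],
)

if __name__ == "__main__":
    print(json.dumps(home, indent=2))
```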

The list pages I'm going to be describing are (likely) going to have mixed content (e.g. maybe an Article and an ImageGallery have the same tag), and the list pages are summaries. https://developers.google.com/search/docs/data-types/carousel wants a list where summaries link to another page and are all of the same type (note that neither of these is a Schema.org restriction; they're from Google). Thus, it's unlikely I'll qualify for Carousels anyway. https://schema.org/Collection says that Collections are "A created collection of Creative Works or other artefacts." These lists are technically created, but really are automatically generated. So, if I were to do a series of some sort, that would make sense as a Collection, but otherwise I should stick to an ItemList.
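For the auto-generated list pages, a plain ItemList of ListItem entries that each point at the full page would be enough. A minimal sketch, with a made-up function name and placeholder URLs; it deliberately says nothing about the type of each target, since the lists are mixed.

```python
import json


def item_list(urls):
    """ItemList for a summary/list page; each entry links to the full page."""
    return {
        "@context": "https://schema.org",
        "@type": "ItemList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "url": url}
            for position, url in enumerate(urls, start=1)
        ],
    }


# Hypothetical tag page that mixes an article and a gallery.
tag_page = item_list(
    [
        "https://example.com/posts/hello/",
        "https://example.com/gallery/trip/",
    ]
)

if __name__ == "__main__":
    print(json.dumps(tag_page, indent=2))
```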

https://developers.google.com/search/docs/data-types/image-license-metadata#multiple-images simply gives an array of images, so I think that for sitemap-ish purposes, that might be the best approach for that use case.
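The "just an array of images" shape could be generated roughly like this. The function name, image URLs, and license URL are placeholders, and I'm only filling in contentUrl plus an optional license; anything else Google asks for would need to be added.

```python
import json


def image_array(content_urls, license_url=None):
    """Top-level array of ImageObject nodes, one per image on the page."""
    images = []
    for content_url in content_urls:
        node = {
            "@context": "https://schema.org",
            "@type": "ImageObject",
            "contentUrl": content_url,
        }
        if license_url is not None:
            node["license"] = license_url
        images.append(node)
    return images


# Hypothetical gallery images, for illustration only.
gallery_images = image_array(
    ["https://example.com/img/a.jpg", "https://example.com/img/b.jpg"],
    license_url="https://example.com/license/",
)

if __name__ == "__main__":
    print(json.dumps(gallery_images, indent=2))
```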

Summary:

ct-martin commented 4 years ago

From b5009ee,

TODO:

  • look at tagging
  • improve WebSite
  • look at author/publisher
  • look at Article typing
  • taxonomies (sections, tags, tag & section listings)