benetech / UX-Guide-EPUB-A11y-Metadata

User Experience Guide for Displaying Accessibility metadata for EPUB

Search engine crawlers don't use schema.org #5

Closed laudrain closed 5 years ago

laudrain commented 5 years ago

Do we have any assurance that search engine crawlers collect any schema.org metadata on the books they find in web pages, whatever the implementation (Microdata, RDFa, JSON-LD)?

If they did, they would have to work out which metadata is the authoritative one among the tons of data found in dozens of web pages for the same title...

There are not many web crawlers. Has anyone contacted their operators to find out exactly what they do with Schema.org metadata declared with "@context": "http://schema.org", "@type": "Book"?
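For readers unfamiliar with the markup under discussion, here is a minimal sketch of a schema.org Book description serialized as JSON-LD, using the same "@context" and "@type" quoted above. The accessibility property names (accessMode, accessibilityFeature, accessibilityHazard, accessibilitySummary) are schema.org properties; the title, ISBN, summary text, and the helper names `book_jsonld` and `as_script_tag` are hypothetical placeholders, not taken from the guide.

```python
import json


def book_jsonld(name: str, isbn: str) -> dict:
    """Build a schema.org Book description with accessibility properties.

    All values are illustrative placeholders (assumption), not real book data.
    """
    return {
        "@context": "http://schema.org",
        "@type": "Book",
        "name": name,
        "isbn": isbn,
        "accessMode": ["textual", "visual"],
        "accessibilityFeature": ["alternativeText", "tableOfContents"],
        "accessibilityHazard": ["none"],
        "accessibilitySummary": "Placeholder summary of the accessibility features.",
    }


def as_script_tag(data: dict) -> str:
    """Wrap the JSON-LD in the script element used to embed it in a web page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical title and ISBN, used only to show the shape of the output.
example = book_jsonld("Example Title", "9780000000000")
print(as_script_tag(example))
```

Whether any given crawler reads these properties is exactly the open question raised in this issue; the sketch only shows what a page would expose.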

gregoriopellegrino commented 5 years ago

You're right: although Google advises implementers to use Schema.org to describe books (https://developers.google.com/search/docs/data-types/book), it actually uses ONIX feeds (I think the ones it receives through the Google Books program).

That said, about 22% of web traffic is generated by (good) bots (https://www.imperva.com/blog/bot-traffic-report-2016/?utm_campaign=Incapsula-moved), so I think it might be useful to expose structured information on the accessibility features of documents. We don't know what information these bots collect, but some of them may collect structured data, and perhaps accessibility information; if not now, it will probably happen in the future.

My proposal is to better explain this concept.
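To make the idea of a bot collecting structured data concrete, here is a hedged sketch of how a crawler could pull JSON-LD Book records out of a page using only the Python standard library. It is an assumption about how such a bot might work, not a description of what any real search engine does; the class `JsonLdExtractor`, the function `extract_books`, and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_books(html: str) -> list:
    """Return the top-level JSON-LD objects whose @type is Book."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item
        for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Book"
    ]


# Hypothetical page content, used only for illustration.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Book", "name": "Example Title",
 "accessibilityFeature": ["alternativeText"]}
</script>
</head><body></body></html>"""

books = extract_books(PAGE)
print(books[0]["accessibilityFeature"])
```

This handles only JSON-LD with a single top-level object per script element; Microdata and RDFa, or JSON-LD using "@graph" or arrays, would need additional parsing.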

gregoriopellegrino commented 5 years ago

I removed the paragraph on search engines (for now) in the last commit and will create a separate document.