scienceai / scholarly.vernacular.io

A vernacular of HTML for scholarly publishing
http://scholarly.vernacular.io/
Apache License 2.0

Why slow down page delivery for 99.9% of users when only 0.1% of cases need the rich data #42

Closed: liborvalenta closed this issue 8 years ago

liborvalenta commented 8 years ago

Have you considered the CrossRef proposals? http://crosstech.crossref.org/2010/03/dois_and_linked_data_some_conc.html

It is way easier to expose, in the HTML, a link to alternative forms of the document in machine-readable formats. Either with an HTML <link> element:

<link rel="alternate" type="application/rdf+xml" href="http://dx.doi.org/10.1126/science.1157784" title="RDF/XML version of this document"/>

or with an HTTP Link header:

Link: <http://dx.doi.org/10.1126/science.1157784>; rel="alternate"; type="application/rdf+xml"; title="big print"

darobin commented 8 years ago

There are several points of disagreement here. I will try to address them one by one.

Page Weight

The impact on page weight is actually minimal. Making use of embedded linked data allows us to style everything by targeting the semantic properties, and therefore to use no classes whatsoever. Not only is the weight difference compared to using classes minimal (in fact it might be favourable), but it also enforces good styling practice, since styles stick to the semantics instead of taking the "easy way out". Additionally, a baseline CSS can be shared (and cached) across all uses, further reducing the impact (see #25).
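As a rough illustration of class-free styling (a minimal sketch, assuming schema.org terms carried by RDFa attributes; the element structure and property names here are illustrative, not quoted from the spec):

```html
<!-- Illustrative fragment: semantics carried by RDFa attributes, no class attributes anywhere. -->
<article typeof="schema:ScholarlyArticle">
  <h1 property="schema:name">A Study of Embedded Linked Data</h1>
  <p property="schema:abstract">Short abstract text goes here.</p>
</article>

<style>
  /* Selectors target the semantic attributes directly, so the markup needs no classes. */
  [typeof~="schema:ScholarlyArticle"] { max-width: 40em; margin: 0 auto; }
  [property~="schema:name"]           { font-size: 1.5em; }
  [property~="schema:abstract"]       { font-style: italic; }
</style>
```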

Templating Difficulty

I have already implemented this twice, in two different templating systems. I don't see the difficulty.

Data Extraction

You are not comparing apples to apples. Getting only the metadata about a document from a separate file will always be easier than getting all the data. But our use case is to expose all the data. You could design a format that would feature the full content of a Scholarly HTML document, but then it would lose all the advantages of being HTML.

Link Headers

Link headers are authoritative metadata that live outside the document: whenever the content is saved, cached, or passed along without its HTTP response, the headers are lost, and with them the ability to process the content correctly. It's an antipattern. See this thread for further discussion.

Who the Consumers Are

You seem to assume that the consumers of the data are only interested in the metadata, and are some sort of specialised crawler that could support ad-hoc rules discussed in a CrossRef blog post. In practice, the consumers of this information want the whole thing, are general-purpose implementations, or both. For the former, authoring tools and scholarly tools in general need to operate on the full content with the full information; getting just the metadata is of limited use. As for the latter, our crawlers are general-purpose schema.org processors: they can actually use the content, which they wouldn't do with the CrossRef hack.
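To sketch the difference (hypothetical markup, not taken from the spec): when the body content itself carries schema.org annotations, a generic RDFa/schema.org processor can consume the figures, citations, and text, not just the front-matter metadata.

```html
<!-- Illustrative only: the content, not just the metadata, is machine-readable. -->
<figure typeof="schema:ImageObject" resource="#figure-1">
  <img property="schema:contentUrl" src="figure-1.png" alt="Measured response over time">
  <figcaption property="schema:caption">Figure 1: Measured response over time.</figcaption>
</figure>
<p>
  As reported by
  <span typeof="schema:ScholarlyArticle" resource="#ref-1">
    <cite property="schema:name">an earlier study</cite></span>,
  the effect is small but consistent.
</p>
```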

CrossRef Centralisation

The whole point of using the Web is decentralisation. Building an architecture for scholarly information around a single point of failure (what's more, one that already often times out) is a recipe for failure.

Changing the HTML Structure

I don't know why you would have the HTML structure changing on a daily basis, and certainly not why there would be multiple flavours. That is just not how Web sites are built.

Content Negotiation

Content Negotiation is not a solution to anything; it just adds an extra problem. Keeping multiple semantic formats in alignment is a very risky idea: it's a great way to produce hard-to-find bugs.