ampproject / amp-toolbox

A collection of AMP tools making it easier to publish and host AMP pages.
Apache License 2.0
450 stars 242 forks source link

ITI: Asset URL rewriting in AMP Optimizer #862

Open jonluca opened 4 years ago

jonluca commented 4 years ago

Moving the conversation here based on https://github.com/ampproject/amppackager/issues/449

We’d like to move forward with adding AMP Packager compatibility mode to the AMP optimizer.

From the previous issue:

The important thing to note is that in the near future, transformers won't need to produce byte compatible AMPHTML to the AMPPackager. This will greatly simplify things and means that transformers only need to produce valid AMP. So you will be able to use AMP Optimizer out of the box. However, for performance reasons, asset URL rewriting to cdn.ampproject.org is critical in the SXG case and should be added to AMP Optimizer. This could be enabled via a flag as an "AMPPacker compatibility mode".

We feel comfortable in adding in a new flag ( --amp-packager-mode?). The base URL would default to cdn.ampproject.org, but could be changed to support different AMP caches (is this what the AMP-Cache-Transform request header is for?).

sebastianbenz commented 4 years ago

Sounds good! I'd prefer to name the flag more generic though: rewriteUrlsToAmpCache=cdn.ampproject.org?

One note: rewriting script import URLs is already supported via the ampUrlPrefix flag. If the packager flag is set this should be overridden (and print a warning if it is defined as well).

jonluca commented 4 years ago

Is there a reason we shouldn't just use the ampUrlPrefix flag? It seems like the behavior is going to be the same. Is ampUrlPrefix used elsewhere and shouldn't be overloaded?

Also, does a mapping of amp-cache-transform headers to URLs exist somewhere already? Took a look in the amp-packager repo but didn't see anything of note.

sebastianbenz commented 4 years ago

ampUrlPrefix is meant for runtime self-hosting. In that case you want to rewrite all script srcs, but not all asset links.

I'm not aware of an existing amp-cache-transform header mapping - afaik, the Google AMP Cache is the only one supporting SXG.

twifkak commented 4 years ago

The amp-cache-transform header mapping is caches.json, as specified (though I'm open to PRs against the spec). For ease of implementation, this is also mirrored at https://cdn.ampproject.org/caches.json.

In practice, AMP Packager is hard-coded to support only google but that's an open bug. No need to reproduce that buggy behavior.