Closed spinmar closed 4 years ago
I'm not sure what issue you are seeing. We have been running this for over 6 months on Sports Mole and it seems to work fine (some past examples can be found at https://amp.sportsmole.co.uk/football/live-commentary/, but they only update while matches are live).
We use the amp-live-list element extensively and checked it was working when we first set it up; I haven't checked recently, though.
Well, it seems that the Google AMP cache is not updated with the real content. If you go to the AMP page directly with a browser, all is OK. It is very strange that only we see the problem.
@spinmar that's interesting. The Google AMP cache should be at most one version behind the origin content, and it gets updated once a user hits the cached page (and is up to date from then on).
I have now checked and our pages are also not reloading on the google cache in the timely way they used to:
shows:
which goes to
The page is uploading quickly but you can't refresh the Google Cache page as you get
I'm sure that when I originally tested this last year the behaviour was different.
This will be live for the next 2 hours.
This is definitely a CORS issue having looked at a debug session on my phone:
My current headers are delivered by:
header("Access-Control-Allow-Origin: $HOST");
header("AMP-Access-Control-Allow-Source-Origin: $ORIGIN");
header("Access-Control-Allow-Methods: GET, POST, OPTIONS");
header("Access-Control-Allow-Credentials: true");
header("access-control-allow-headers: Content-Type, Content-Length, Accept-Encoding, X-CSRF-Token");
header("access-control-expose-headers: AMP-Access-Control-Allow-Source-Origin, AMP-Redirect-To");
The image and some of the instructions imply that one needs to include the AMP cache domains in these headers. Unfortunately, there is no example of how this works, and the syntax at
https://www.ampproject.org/docs/guides/amp-cors-requests
implies that it should deal with these domains automatically:
If the Origin header is set:
If the origin does not match one of the following values, stop and return an error response:
*.ampproject.org
*.amp.cloudflare.com
the publisher's origin (aka yours)
where * represents a wildcard match, and not an actual asterisk ( * ).
If the value of the __amp_source_origin query parameter is not the publisher's origin, stop and return an error response.
If the two checks above pass, process the request.
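The two checks quoted above can be sketched roughly as follows. This is only an illustration of the quoted steps for the case where the Origin header is set, not the AMP project's reference code; the function names and the regular expressions used for the wildcard match are assumptions, and the publisher origin is the one used in this thread.

```python
import re

# Assumption: the publisher origin discussed in this thread.
PUBLISHER_ORIGIN = "https://amp.sportsmole.co.uk"

# Hypothetical regexes standing in for the wildcard entries in the guide:
# *.ampproject.org and *.amp.cloudflare.com
ALLOWED_ORIGIN_PATTERNS = [
    re.compile(r"https://[a-z0-9.-]+\.ampproject\.org"),
    re.compile(r"https://[a-z0-9.-]+\.amp\.cloudflare\.com"),
]


def is_allowed_origin(origin: str) -> bool:
    """Check 1: the Origin header must be the publisher or a known cache."""
    if origin == PUBLISHER_ORIGIN:
        return True
    return any(p.fullmatch(origin) for p in ALLOWED_ORIGIN_PATTERNS)


def validate_amp_cors_request(origin: str, source_origin: str) -> bool:
    """Return True only if both quoted checks pass.

    origin        -- value of the Origin request header (assumed to be set)
    source_origin -- value of the __amp_source_origin query parameter
    """
    if not is_allowed_origin(origin):
        return False
    # Check 2: __amp_source_origin must be the publisher's origin.
    return source_origin == PUBLISHER_ORIGIN
```

A request from the Google cache origin seen later in this thread, with `__amp_source_origin` set to the publisher origin, would pass; any other origin would be rejected.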
The examples in the linked document show:
HTTP/2 200
access-control-allow-headers: Content-Type, Content-Length, Accept-Encoding, X-CSRF-Token
access-control-allow-credentials: true
access-control-allow-origin: https://ampbyexample.com
amp-access-control-allow-source-origin: https://ampbyexample.com
access-control-allow-methods: POST, GET, OPTIONS
access-control-expose-headers: AMP-Access-Control-Allow-Source-Origin
So any ideas?
@QES my current working headers:
access-control-allow-credentials:true
access-control-allow-origin:https://www-cibercuba-com.cdn.ampproject.org
access-control-expose-headers:AMP-Access-Control-Allow-Source-Origin
amp-access-control-allow-source-origin:https://www.cibercuba.com
amp-same-origin:true
I don't believe CORS plays a part in this, as amp-live-list requests are proxied through the cache. Looking into this right now.
Thanks very much for looking at it.
@seomaz thanks that is exactly what I have (I think - but with my domains)
When this is cached by the cache the browser needs to know that it is OK - how do you include the Google Cache domains in the headers so that the browsers know to include them?
I think I have remembered when this may have stopped working, or what has changed since I confirmed it was working last year.
Since then we have added amp-access on all our pages, plus amp-list (in addition to amp-live-list).
I have worked out that there are generic issues, which I can now see are caused by the amp-access element in the Google cache. So at:
https://www.google.co.uk/amp/s/amp.sportsmole.co.uk/football/arsenal/league-cup/live-commentary/live-commentary-arsenal-vs-man-city_319529.html
I get the error:
Failed to load https://amp.sportsmole.co.uk/amp_ping/82adlfDUPAWVBF7dAF0JXu7hPOhd8yjH9UV4EbGEvImp6oD518g2xlnlRTNs2eIB0.83148165848165421/?rid=82adlfDUPAWVBF7dAF0JXu7hPOhd8yjH9UV4EbGEvImp6oD518g2xlnlRTNs2eIB&uri=https%3A%2F%2Famp.sportsmole.co.uk%2Ffootball%2Farsenal%2Fleague-cup%2Flive-commentary%2Flive-commentary-arsenal-vs-man-city_319529.html&pass=PIK&_=0.43455036688366455&host=amp.sportsmole.co.uk&ht=https:&FB=OK&ref=https%3A%2F%2Famp.sportsmole.co.uk%2Ffootball%2Farsenal%2Fleague-cup%2Flive-commentary%2Flive-commentary-arsenal-vs-man-city_319529.html&dynamic&__amp_source_origin=https%3A%2F%2Famp.sportsmole.co.uk: The 'Access-Control-Allow-Origin' header has a value 'https://amp.sportsmole.co.uk' that is not equal to the supplied origin. Origin 'https://amp-sportsmole-co-uk.cdn.ampproject.org' is therefore not allowed access. Have the server send the header with a valid value, or, if an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
The underlying file being loaded has CORS headers of:
Access-Control-Allow-Credentials:true
access-control-allow-headers:Content-Type, Content-Length, Accept-Encoding, X-CSRF-Token
Access-Control-Allow-Methods:GET, POST, OPTIONS
Access-Control-Allow-Origin:https://amp.sportsmole.co.uk
access-control-expose-headers:AMP-Access-Control-Allow-Source-Origin, AMP-Redirect-To
AMP-Access-Control-Allow-Source-Origin:https://amp.sportsmole.co.uk
What looks odd to me in the above is that the error reports:
https://amp-sportsmole-co-uk.cdn.ampproject.org
as the problem origin but the URL being loaded from is
That looks like the old domain nomenclature, but I'm not sure how that is being used here.
Meanwhile, I updated the CORS response headers. Let's see if it solves the problem.
@QES you need to add this domain: https://amp-sportsmole-co-uk.cdn.ampproject.org (look at my example).
@seomaz but does that still work on the original origin?
access-control-allow-credentials:true
access-control-allow-origin:https://www-cibercuba-com.cdn.ampproject.org
access-control-expose-headers:AMP-Access-Control-Allow-Source-Origin
amp-access-control-allow-source-origin:https://www.cibercuba.com
amp-same-origin:true
My AMP pages are served from amp.sportsmole.co.uk and users view them at that location. If I change access-control-allow-origin
from amp.sportsmole.co.uk, will it still work correctly when viewed from that origin, and what if the page is being viewed from a different cache?
Is it possible to have multiple domains in access-control-allow-origin?
For example, can you have:
access-control-allow-origin: https://amp.sportsmole.co.uk https://amp-sportsmole-co-uk.cdn.ampproject.org
Is that valid, and do you also need to include other potential caches, i.e. Cloudflare?
I have just checked and if I set it to
access-control-allow-origin: https://amp-sportsmole-co-uk.cdn.ampproject.org
Then it breaks when loaded from amp.sportsmole.co.uk
@QES I think that access-control-allow-origin should be set to the value of the Origin request header (when amp-same-origin is not set and the origin is in the allowed domains).
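Access-Control-Allow-Origin carries a single origin, so a space-separated list is not an option. A minimal sketch of the suggestion above, echoing the request's Origin back when it is on an allow-list, might look like this. The function name and the explicit allow-list are hypothetical; the header names and origins are the ones quoted in this thread.

```python
from typing import Dict

# Assumption: the publisher origin and Google cache origin from this thread.
PUBLISHER_ORIGIN = "https://amp.sportsmole.co.uk"
ALLOWED_ORIGINS = {
    PUBLISHER_ORIGIN,
    "https://amp-sportsmole-co-uk.cdn.ampproject.org",
}


def cors_response_headers(request_origin: str) -> Dict[str, str]:
    """Build CORS response headers for one request.

    Access-Control-Allow-Origin echoes the caller's origin, while
    AMP-Access-Control-Allow-Source-Origin always names the publisher.
    """
    if request_origin not in ALLOWED_ORIGINS:
        raise PermissionError(f"origin not allowed: {request_origin}")
    return {
        "Access-Control-Allow-Origin": request_origin,
        "AMP-Access-Control-Allow-Source-Origin": PUBLISHER_ORIGIN,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Expose-Headers": "AMP-Access-Control-Allow-Source-Origin",
    }
```

With this shape the same endpoint can answer both the publisher's own pages and the cached copy, because the allowed origin is chosen per request rather than hard-coded.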
@QES add this:
access-control-allow-credentials:true
access-control-allow-origin:https://amp-sportsmole-co-uk.cdn.ampproject.org
access-control-expose-headers:AMP-Access-Control-Allow-Source-Origin
amp-access-control-allow-source-origin: https://amp.sportsmole.co.uk
amp-same-origin:true
Reading more about this, the problem is that I have to change the
access-control-allow-origin
depending on whether it is loaded from
amp.sportsmole.co.uk OR amp-sportsmole-co-uk.cdn.ampproject.org OR amp.sportsmole.co.uk.amp.cloudflare.com etc
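For reference, the Google cache origin seen in this thread can be derived from the publisher host by replacing each "-" with "--" and each "." with "-". The sketch below covers only that simple case for plain ASCII hosts; it deliberately ignores internationalised domains and the fallback used for hosts that do not fit this scheme. The function names are hypothetical, and the Cloudflare form is copied as written in this thread rather than verified.

```python
from typing import List


def google_cache_origin(publisher_host: str) -> str:
    """Simplified Google AMP cache origin for a plain ASCII host."""
    # Order matters: double existing hyphens first, then turn dots into hyphens.
    encoded = publisher_host.replace("-", "--").replace(".", "-")
    return f"https://{encoded}.cdn.ampproject.org"


def candidate_origins(publisher_host: str) -> List[str]:
    """Origins a page on publisher_host might be loaded from."""
    return [
        f"https://{publisher_host}",
        google_cache_origin(publisher_host),
        # Assumption: Cloudflare form as written in this thread.
        f"https://{publisher_host}.amp.cloudflare.com",
    ]
```

For example, `google_cache_origin("amp.sportsmole.co.uk")` gives the `amp-sportsmole-co-uk.cdn.ampproject.org` origin that appears in the CORS error above.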
Now I think this should be set in the __amp_source_origin parameter that the cache sends on the URL, but in the example I looked at in developer mode that did not seem to be the case.
What is correct is that the URL itself:
Seems to say #origin=https%3A%2F%2Fwww.google.co.uk which isn't useful either and not correct.
While the call to the problem (the amp-access JSON end point) is
Is using:
__amp_source_origin=https%3A%2F%2Famp.sportsmole.co.uk
And so it breaks because the ORIGIN is
It seems that I solved my problem by handling the AMP response headers correctly. I followed the instructions on this page, https://www.ampproject.org/docs/guides/amp-cors-requests, and now it seems that the Google AMP cache has the updated version of the page.
Hi @spinmar, can you access your AMP pages directly on the origin, and does it work both on the origin and on the Google AMP cache? If so, how have you collected the info needed to modify the headers that differ between these two scenarios?
Hi @spinmar @seomaz @erwinmombay, after much playing I think I have worked this out. I do think the documentation at https://github.com/ampproject/amphtml/blob/master/spec/amp-cors-requests.md, while improved, could be better: not because it is wrong, but because the text doesn't make it clear that you need to check and use the Origin request header for this.
With the benefit of hindsight and knowing what I was looking for I found the message in the documentation that is key to making this work.
I think this needs something that highlights the issue in a more forceful way as there are forever people having CORS issues.
Having solved the CORS issues, I looked at the amp-live-list issue that this was originally about.
This is STILL an issue. Looking at a page that updates with amp-live-list, an example is:
https://www.google.co.uk/amp/s/amp.sportsmole.co.uk/live-scores/
When there are games active, there will be updated times every minute or so. These update seamlessly on the origin version:
https://amp.sportsmole.co.uk/live-scores/
Checking the console and the network traffic, the cache version fetches the following file every 15 seconds:
while the ORIGIN version is getting
The issue that I notice is that the Request Headers are different and do not include an Origin.
:authority:amp-sportsmole-co-uk.cdn.ampproject.org
:method:GET
:path:/v/s/amp.sportsmole.co.uk/live-scores/?usqp=mq331AQECAEYAQ%3D%3D&_js_v=0.1&_latest_update_time=1520012933&__amp_source_origin=https%3A%2F%2Famp.sportsmole.co.uk
:scheme:https
accept:text/html
accept-encoding:gzip, deflate, br
accept-language:en-US,en;q=0.9
amp-same-origin:true
cache-control:no-cache
cookie:AMP_CANARY=1; AMP_EXP=amp-date-picker
dnt:1
pragma:no-cache
referer:https://amp-sportsmole-co-uk.cdn.ampproject.org/v/s/amp.sportsmole.co.uk/live-scores/?usqp=mq331AQECAEYAQ%3D%3D&_js_v=0.1
user-agent:Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) CriOS/56.0.2924.75 Mobile/14E5239e Safari/602.1
Looking further at this the called URL
is containing an empty page with no data when loading from the cache but
loads the page as expected. Not sure what the cache is doing at this point if it is looking for change in some way or doing something clever.
@QES Sorry for the late answer, but I didn't see your request. On my live AMP page I solved the problem by setting the AMP response headers correctly. It works both in the Google AMP cache and directly. I followed the pseudocode here:
https://www.ampproject.org/docs/guides/amp-cors-requests
to set the response header.
@spinmar hi, thanks. I eventually managed to work out the correct incantation for this, which seems to be overly obfuscated in the documentation. It boils down to:
amp-access-control-allow-source-origin must be YOUR origin, i.e. the publisher domain.
access-control-allow-origin must be the origin the page is loaded from (and that must be on the approved list of domains).
However, the complication is working out how the cache tells the server which origin it is. You can check the request headers, either Origin or Referer depending on which is set. To me that isn't made clear enough, but once it is explained it becomes obvious :)
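One way to read the Origin-or-Referer approach described above is sketched below. It is an interpretation of this comment rather than anything confirmed by the AMP documentation: the helper name is hypothetical, and the Referer header can be absent or stripped, so it is treated purely as a fallback. The result would still need to be checked against the approved list of domains before being echoed back.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def requesting_origin(headers: Mapping[str, str]) -> Optional[str]:
    """Best-effort guess at the origin a request came from.

    Prefers the Origin header; otherwise reduces the Referer URL to
    scheme://host. Returns None when neither header is usable.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    origin = lowered.get("origin")
    if origin:
        return origin
    referer = lowered.get("referer")
    if referer:
        parts = urlsplit(referer)
        if parts.scheme and parts.netloc:
            return f"{parts.scheme}://{parts.netloc}"
    return None
```

Applied to a request that has no Origin but carries a Referer on the cache domain, this would yield the cache origin, which can then be validated and used for access-control-allow-origin.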
Thanks for the feedback and help.
Unfortunately, there is another problem: the amp-live-list is not actually getting any data in the subsequent calls to the origin.
access-control-allow-credentials:true
access-control-allow-origin:https://amp-sportsmole-co-uk.cdn.ampproject.org
access-control-expose-headers:AMP-Access-Control-Allow-Source-Origin
amp-access-control-allow-source-origin: https://amp.sportsmole.co.uk
amp-same-origin:true
@spinmar are you seeing your pages load the amp-live-list updates successfully on the google cache?
I'm still not seeing pages being loaded (and I'm not getting underlying CORS errors any more) :)
This issue doesn't have a category which makes it harder for us to keep track of it. @erwinmombay Please add an appropriate category.
This is a high priority issue but it hasn't been updated in awhile. @erwinmombay Do you have any updates?
This bug is internally tracked b/124009333 as it is most likely a cache problem
assigning to @twifkak based on b/124009333
Hi @twifkak, any news on this issue? Thanks
@xavierleune There's been some internal design/research, but it's still pending some work to figure out how to do it without causing some other problems. Unfortunately nothing to announce yet.
@twifkak thanks for your feedback. Do you have any details to share about the root causes? As far as I can see, only a small number of publishers are suffering from this issue. Maybe there is some workaround we can apply on our side?
@xavierleune Can't share many details. There are multiple caching tiers that are contributing to the problem, and different solutions to different tiers. You can help reduce problems from one of the tiers by making sure you specify a cache-control with a short s-maxage (or no s-maxage and a short max-age). That will not fully prevent it.
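As a small illustration of the two options mentioned, the helper below builds a Cache-Control value either with a short s-maxage or with only a short max-age. The helper name and the example durations in the usage note are placeholders, not recommended values, and as stated above this only reduces the problem on one caching tier.

```python
from typing import Optional


def cache_control(max_age: int, s_maxage: Optional[int] = None) -> str:
    """Build a Cache-Control header value.

    max_age  -- freshness lifetime in seconds for all caches
    s_maxage -- optional override in seconds for shared caches
    """
    directives = [f"max-age={max_age}"]
    if s_maxage is not None:
        directives.append(f"s-maxage={s_maxage}")
    return ", ".join(directives)
```

For instance, `cache_control(15)` corresponds to the "no s-maxage and a short max-age" option, while `cache_control(60, s_maxage=15)` sets a short s-maxage explicitly.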
Hi, we have a problem on our AMP live football page related to the Google AMP cache. We provide live coverage of Italian Serie A and Serie B football games. The user can follow the live coverage without refreshing the page: on the non-AMP page we handle it with JavaScript, while on the AMP one we use the amp-live-list tag with a refresh time of 20 seconds. If the user calls the AMP page directly, there is no problem and the refresh works well. Example:
https://sport.virgilio.it/dirette/live/serie-a/26-2-2018/cagliari-napoli/3835/amp/
However, if the user reaches the page from Google search results, and therefore via the Google cache, the page does not seem to refresh:
https://www.google.it/amp/s/sport.virgilio.it/dirette/live/serie-a/26-2-2018/cagliari-napoli/3835/amp/
Our server always replies with max-age:0, so Google should not cache the page and should always request it from our server. During live coverage, the AMP version of the page always lags behind the non-AMP page. Can someone explain where the problem is? Thanks