Closed tremby closed 5 years ago
I'm wondering if providing a subdomain which on Cloudflare DNS control points to the base IP but is set to bypass caching (grey bypass arrow) will solve your problem.
I use this to avoid similar issues with sftp for various accesses that need it, such as direct editing, usually for staging when that is on a remote server..
The setting on Cloudflare means that only the DNS is being handled, with no influence from any Cloudflare process possibilities on the data transferred -- which is then actually coming direct from your server. You'd run admin then on the subdomain,
Sorry - above got truncated before sent -- repaired.
Using (AWS) Cloudfront here, not Cloudflare.
This (or just bypassing the CDN entirely) would possibly work in this particular case as long as Craft doesn't build full URLs including the domain when doing things like generating live preview URLs. But I believe it does, and so the requests would just end up being sent to the regular Cloudfront-domain URLs anyway. I wonder if this is a configuration issue.
But bypassing the CDN by using a different subdomain would not help in the case where people are already using different subdomains for different pages on their site, as described here.
My suggested change would, I believe, handle that case too.
Think you've got your names swapped (Cloudfront is AWS), but see your points.
Indeed actually the site itself won't run, much less CP, with a subdomain url. But this is rewritable to the expected url via nginx config or apache .htaccess, no?
I'm kind of wondering if this approach might also remove the subdomains problem you mention also -- maybe you could try it.
Think you've got your names swapped (Cloudfront is AWS), but see your points.
Yes, I noticed after posting and edited it.
Indeed actually the site itself won't run, much less CP, with a subdomain url. But this is rewritable to the expected url via nginx config or apache .htaccess, no?
I think perhaps we don't understand each other.
If I'm understanding your suggestion correctly, I would have a new subdomain origin.example.com
which points to the same IP as Cloudfront points to (in my case this is the load balancer). This would bypass Cloudfront entirely. I understand this and it is in fact something I considered earlier. I initially rejected it because I don't want a separate domain open to the public leading to the same content, and locking it down by IP address and keeping that up to date with the clients' IP addresses would be a nightmare.
But the Craft website is configured to know that its URL is https://www.example.com
, and writes URLs which start with this domain, including the live preview URLs. So even if I can get Craft to respond to https://origin.example.com/admin
and get to an edit page, as soon as I hit "live preview" it'll POST to https://www.example.com/articles/article-1
, which is handled by Cloudfront and will fail.
I'm kind of wondering if this approach might also remove the subdomains problem you mention also -- maybe you could try it.
I don't have a site which has different subdomains for different languages. It's just a problem I foresee (for other people) with having a special "admin" subdomain. You'd need special "admin" subdomains for each language subdomain then, and I'm guessing Craft would have to be aware of the mappings. It seems like it would be overcomplicated.
Do you see any issues with my suggested fix of moving all live preview endpoints to some URL under /admin
?
The reason LP uses the normal entry URL is so any conditions/other logic in templates that factor in the entry URL will work as expected, and any relative CSS/JS URLs will load correctly. We could fake the entry URL for template logic if we switched to something like this, but it would still break relative CSS/JS URLs (not that people should be doing that anyway).
That said it’s worth considering, especially when we get around to #1521 & #1489
About to leave here, but some quick responses:
@narration-sd:
nightmare keeping up moving IPs? Well, that might be true with Cloudfront. On Cloudflare it's very easy. Or maybe just use a CNAME, if double lookup doesn't materially affect access speed? You have to keep up with the IPs anyway, don't you?
Clients' IPs, not server IPs. We wouldn't want a non-cached entry point to the site available to the public. I suppose we could add HTTP basic auth on this special domain and lock it down that way...
I'm not suggesting you make the Craft CP (or site) knowledgeable about multiple (sub)domains. I'm suggesting you rewrite them in server setup so that Craft always sees the canonical one. Again, nginx config or apache htaccess rules.
But as I said, Craft sends the POST request to its canonical URL, not to the current one, so how would the live previews hit origin.example.com
? Perhaps there's some server config you're referring to I'm not aware of which rewrites domains found in HTTP responses?
@brandonkelly:
The reason LP uses the normal entry URL is so any conditions/other logic in templates that factor in the entry URL will work as expected, and any relative CSS/JS URLs will load correctly. We could fake the entry URL for template logic if we switched to something like this, but it would still break relative CSS/JS URLs
Interesting. Would adding a <base>
tag to the outgoing HTML containing the entity's public URL solve this?
(not that people should be doing that anyway).
Agreed. I wouldn't be against breaking live preview for these users and documenting it...
@tremby sorry, really am in a hurry .Think this looks good on quick scan, about how it's done on nginx, and similarly enough for apache.
On the auth/available problem, not sure I see that if SEO (answerable via canonical would seem) isn't the issue, but that's up to you and client of course - a pleasant smile :)
As well, seems you could with related server config also redirect if someone accessed the non-cached url not as a CP request, which wouldn't need the headache of auth. Given Craft is set up properly with regard to clicking on the Site name from CP, should work even there?
Best fortune, in any case, Bart, and you see Craft principals are already on the case...!
Interesting. Would adding a
tag to the outgoing HTML containing the entity's public URL solve this?
Probably but we'd also need to check if they already define a <base>
tag, and not really interested in messing with peoples’ HTML.
@narration-sd:
Think this looks good on quick scan, about how it's done on nginx, and similar enough for apache.
Sorry, but I really still don't see how this helps. Aren't these just regular rewrite rules, which change how the server interprets a request URL or cause an HTTP redirection to elsewhere? The problem is the URLs the server (Craft) is sending back out. In the Javascript chunk for live previews, for example, it gives the URL to send the POST request to as a full URL with domain, and that domain is the canonical domain. So the Javascript will then send the POST request to the canonical domain.
Are you suggesting I set up rewrite rules on the www.example.com
server so that POST requests to URLs such as /articles/article-1
get redirected to origin.example.com
? When most (all?) clients get a 301 or 302 response they will send again to the new URL with GET rather than POST (for security reasons, I think), so I'm not sure this would work.
Definitely crossed wires here, and my hurry isn't helping.
Ok, and that really is all I can cook up for now. Fire away though for the thread where I'm missing something, as likelihood seems reasonably high at this level of handwaving -- cheers
Craft will reply leading to request of the canonical url, and that looks like it wouild cause cross-domain issues for the Live View panel, thus failure in a different mode than your original problem.
They'd have a common ancestor domain, so with CORS and cookie domain set up correctly that shouldn't be an issue. It's still the issue that the POST will go to the canonical URL, and that this will point at Cloudfront.
if it would work (needs trial to verify no presently-unseens interfere), maybe the simplest thing Craft could do as a feature to help is prefix Live Preview paths with a known (configurable general.php) pseudopath, making the case easier to recognize.
This is exactly what my original post here is suggesting, but which Brandon points out could break theme logic and relative CSS/JS URLs.
In the mean time I'm exploring some solutions. The current one I'm exploring is a small opinionated patch on Craft CMS (:-1:) to alter the preview URLs sent out, combined with some nginx configuration. If I come up with a working solution I'll post it, but as yet I've found no way to do this without altering Craft code.
OK, my current solution is this:
Set up a new subdomain and server block origin.example.com
. In my case this has the same document root as the main server block which Cloudfront points at, but has a few differences:
It has CORS headers set up (as documented)
It redirects GET
and HEAD
requests to the canonical domain via 301
responses:
location / {
if ($request_method ~ "^(GET|HEAD)$") {
return 301 $scheme://www.example.com$request_uri;
}
# Regular Craft rule:
try_files $uri $uri/ /index.php?$query_string;
}
It doesn't have any of the more specific directives specified in the main www
server section which would only apply to GET
requests anyway, like responding specially with far-future expiry dates to requests for asset files. It still does have the block responding to .php
files, setting up fasctcgi_pass
etc. If there's a nicer way I'd love to hear it. It feels to me like there ought to be a way to write this server block in 5 lines or so without repeating so much stuff from the main server block.
Set up Craft to use the shared ancestor domain for cookies, as documented, so in this case that's .example.com
.
Add a new environment variable PREVIEW_URL
, which points to the admin domain, PREVIEW_URL=https://origin.example.com
. In non-production environments this can be blank to behave like normal.
Add a small patch to each craft/app/controllers/EntriesController.php:254
and craft/app/controllers/CategoriesController.php:286
, inside the "Enable Live Preview?" conditionals. (This code also uses a CRAFT_URL
environment variable, and I'm not sure if this is standard or just in our application. It's the scheme and domain of the site with no trailing slash. craft()->getSiteUrl()
gave me something including a trailing slash, which was not ideal.)
diff --git a/craft/app/controllers/CategoriesController.php b/craft/app/controllers/CategoriesController.php
index 5514fe9..35a1919 100644
--- a/craft/app/controllers/CategoriesController.php
+++ b/craft/app/controllers/CategoriesController.php
@@ -286,10 +286,19 @@ class CategoriesController extends BaseController
// Enable Live Preview?
if (!craft()->request->isMobileBrowser(true) && craft()->categories->isGroupTemplateValid($variables['group']))
{
+ $previewUrl = $variables['category']->getUrl();
+ $previewBase = getenv('PREVIEW_URL');
+ if ($previewBase) {
+ if ($previewUrl[0] === '/') {
+ $previewUrl = $previewBase . $previewUrl;
+ } else {
+ $previewUrl = str_replace(getenv('CRAFT_URL'), $previewBase, $previewUrl);
+ }
+ }
craft()->templates->includeJs('Craft.LivePreview.init('.JsonHelper::encode(array(
'fields' => '#title-field, #fields > div > div > .field',
'extraFields' => '#settings',
- 'previewUrl' => $variables['category']->getUrl(),
+ 'previewUrl' => $previewUrl,
'previewAction' => 'categories/previewCategory',
'previewParams' => array(
'groupId' => $variables['group']->id,
diff --git a/craft/app/controllers/EntriesController.php b/craft/app/controllers/EntriesController.php
index bd20a2e..691a0bc 100644
--- a/craft/app/controllers/EntriesController.php
+++ b/craft/app/controllers/EntriesController.php
@@ -254,10 +254,19 @@ class EntriesController extends BaseEntriesController
// Enable Live Preview?
if (!craft()->request->isMobileBrowser(true) && craft()->sections->isSectionTemplateValid($variables['section']))
{
+ $previewUrl = $variables['entry']->getUrl();
+ $previewBase = getenv('PREVIEW_URL');
+ if ($previewBase) {
+ if ($previewUrl[0] === '/') {
+ $previewUrl = $previewBase . $previewUrl;
+ } else {
+ $previewUrl = str_replace(getenv('CRAFT_URL'), $previewBase, $previewUrl);
+ }
+ }
craft()->templates->includeJs('Craft.LivePreview.init('.JsonHelper::encode(array(
'fields' => '#title-field, #fields > div > div > .field',
'extraFields' => '#settings',
- 'previewUrl' => $variables['entry']->getUrl(),
+ 'previewUrl' => $previewUrl,
'previewAction' => 'entries/previewEntry',
'previewParams' => array(
'sectionId' => $variables['section']->id,
This patch changes the domain of the live preview URLs passed to the admin panel, so that when a live preview is requested the POST request goes to the alternative domain.
The browser sends the auth cookie through thanks to the cookie domain being an ancestor of both sites, and the alternative domain allows the POST request due to its CORS headers. The alternative domain hits PHP with the request with serves the response. Any GET requests which come through to the alternative domain (maybe for assets and that kind of thing, or followed links to other pages) will get 301
responses and so be redirected to go through the cache.
I don't much like this solution.
(couldn't get out of here, realized it will have to be tomorrow)
I think Brandon indicated he could deal with the offset paths.
Relative inclusion urls in terms of /path fixed to site root will be ok, and surely with Craft there's no impetus to write otherwise, with tools or if not using its services.
And you're right -- able to read more carefully, and include some assumptions, the above isn't different from one of your proposals,
I do like such better than the latest solution, though appreciate the thought in the effort.
That said, will have voted your feature.
I entered into this to spare Bend Central some effort. I think we accomplished that, so good.
something to be said surely again, for the amount of detail such 'simple' adjustments need to consider....
Regards, Clive
I agree with everything you've said. I hope changing the endpoint for live previews is something which might eventually make it into Craft. I guess it's just a matter of making a decision about whether it's okay to break live previews for sites with relative asset URLs.
In the mean time, I wonder if there are any ideas about how I might alter my above solution so that no files from Craft itself are modified? Is there a way I can achieve the same thing by configuration and possibly extending classes rather than editing core files?
You know, I had another quick go at this and discovered Cloudfront rule matching is pretty unelaborate, no doubt for the efficiency - no regex-like abilities. A glance at CloudFlare doesn't show much better.
Reading once again your original problem statement, though, might you be able to solve by simply excluding any CP (admin) paths from caching?
At first glance this looks unsupported, as there isn't a NOT operator in path patterns, but then I stumbled across something which you can find on this page: Developer Guide
If you want to apply a different cache behavior to the files in the images/product1 directory than the files in the images and images/product2 directories, create a separate cache behavior for images/product1 and move that cache behavior to a position above (before) the cache behavior for the images directory.
So, could your solution be to encode a rule matching the Craft admin introducer, which doesn't cache, and put that rule before others that do?
...I am quite possibly being too simpleminded about this, and also this time jumping right out the door, but you can look at it...
That doesn't help. The POST
requests for live preview pages are not to addresses under the /admin
prefix. That's what this entire thread is about.
Hmm, yes, you did say that at the outset. I was in fact fishing around in the code at some rapid rate this afternoon and wanted to think it said something else, but what I saw makes more sense this way. I believe I might have misinterpreted your example because it looked like (though reversed of) the number-hyphen addition the CP puts on the entry slug when editing it; was thinking too much about that.
Ok. The fact that the Live Preview path, rather than looking like the edit path, is indistinguishable from actual live, means nothing in string matching is going to help, without the addition your feature request asks. If we want to try so before it might be realized, we have to trigger on recognizing that it's a POST.
I had a look at your patch solution. You might be able to work out something with Composer to substitute your fork of the code instead of disturbing distribution files, at least on Craft 3. It looks like this is a reasonable explanation of the 'replace' ability to do this: Composer Replace.
I don't believe such is going to be possible on Craft 2 because you don't have a loader managing affairs. Something like a Git pull that you run over code releases each time might work for Craft 2, but I don't think C2 is public on git.
There is a sort of fascinating (and deadly) patcher I have used on Ansible, LineInFile, which could possibly be usable, if not going to be clear with you if there are source changes in the area. You might want to just manage by diffs and keep it simple, if you are going to patch C2.
Now, what about another alternative? I had already the scent of Lambda@Edge, just thinking it was probably too heavy to use on your problem. But in fact, it appears to be set up to be very light weight. I'll have to be careful what I propose here, as haven't looked at more than the surface of it, but it appears to be well integrated by intent with Cloudfront, designed to be used at very high access rates for diversions as we are looking for, and to be able to pick off and modify or substitute responses at any of four stages.
I'm proposing that you might use a Lambda function to recognize on the fact that there's a POST going on; possibly also that this is an xhr/ Ajax call, and I think it will be necessary to see it's not an action request (recognize by path), the other use of Ajax which might have POST. Codewise pretty minimal.
Then you would (i'm not sure this one is possible) trip out of your sessionId-cookie-destroying rule, or indeed just pass the request to the server, and return its response I think through the Lambda function.
You can see I'm hazy on details, at the limit of what I can put into this today, but my sense is that one of these or both ought to be possible, and that it ought to be worth looking at, quite possibly constructing up a model. Looks like Lambda code is Node code, and they have some examples among the basic materials here (not just in the Examples section): Using Cloudfront with Lambda@Edge .
Very interested actually in what you might make of one of these paths, Bart.
p.s. just for completeness I went out looking for any equivalent services on CloudFlare. I didn't find any, though there are a number of initiatives for public domain or commercial Lambda alternatives on platforms, not so far along yet it seemed. However, CloudFlare has by now long had Railgun, a different take on active page caching, which I once tried to pretty nice results for a site needing European presence, potentially serving especially Russia. Using CloudFlare, you could go to the simple alternative uncached domain equivalent I early mentioned here, and as you actually use in your patch scheme above. Probably not attractive since you appear embedded with AWS, but any of these mentioned could possibly be worth a little keeping track of, no?
The Lambda@Edge idea is interesting, and I'll look into it. Thanks.
Good enough.
I had a flash this morning on why CloudFlare/Railgun didn't seem to have problems that you run into with the SessionId cookie. It's because it sends back only the delta for the cached item each time. This means you pay at least ttfb including latency for each page access, so it is not quite as effective as a cache, though they employ a pre-http/2-spdy persistent channel that speeds up any other delta.
Still, this flavor might interest, if you don't get somewhere positive with Lambda for example. If you wanted to try an experiment with it, Arcustech who runs Craft's own site and has a long good history with the community offers Railgun-included even on lowest-price hosting; you just have to ask for it.
I had a flash this morning on why CloudFlare/Railgun didn't seem to have problems that you run into with the SessionId cookie. It's because it sends back only the delta for the cached item each time.
I don't understand.
If you are sending a POST
request for a live preview, and your CDN is filtering out the session ID cookie, you will get a redirection to the log in page. How does what you're describing get around that?
In my present view of how Railgun works, nothing is filtered out -- that was the basis of the flash.
SessionId, CSRF, and any other changed cookies would all come through as efficient deltas, with the minimal delay of one accelerated server access, then patched into the cached delivery of all the rest.
I looked up Railgun. I appears to be a compression scheme between the origin server and the CDN. It's not something we're interested in, but thanks.
Well, I think you'll find it's actually a cache -- overlaid with the accelerated delta of anything that does actually change on active pages. A careful read of first section here?
https://www.cloudflare.com/docs/railgun/intro.html
The normal cache operation occurs for non-active accesses like images, js, etc..
But anyway, of course you have to see what's good for your purpose...
I did read it. We want public pages which are essentially static (which is everything but admin pages) to be cached too. Requests for each should not hit the origin server more than once per hour or so. Even with Railgun the origin server is still fully processing each request and building the response, and then additionally the Railgun daemon is processing or diffing it in whatever way Cloudflare has devised and sending a smaller payload back to the CDN.
All true, and only you know how efficient you want to be.
It's just natural on long experience to question the degree necessary -- realities vs. ideal concept images -- but probably you and team are well aware of that also, as a 'disease', 'sometimes' occuring in software circles :)
Bart, you may well be tired of hearing from me on this, but something possibly interesting got pushed to my phone by Google Now over the weekend.
http://codingthesmartway.com/introduction-to-server-rendered-vue-js-apps-with-nuxt/
Given you could fit with Vue, nuxt-generate would produce a completely static image of your site at will.
Given your site is so largely static as you indicate, may be a way to quite simplify matters?
The developers look an interesting bunch, in any case.
Their doc site here is running this kind of static arrangement, and turns out they even use a Lambda script to cleanly deploy it.
As well, the Vuejs-nuxt discussion of async data suggests a couple of things:
Not delving into this, but quite possibly /action could be useful this way also for protecting any actions that would like to have session information...whether that comes up now or in the future.
The agency I'm contracting for launched a new site today, this time with Craft 3. It seems nothing changed on this topic between Craft 2 and 3 -- still no previews possible with the CDN setup we want.
So I had a go at making the Lambda@Edge function suggested above and have that working.
Cloudfront is set up like this:
This Lambda function simply removes any cookies from incoming HEAD and GET requests, and leaves others alone. Here it is:
exports.handler = async (event) => {
// See https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-event-structure.html
const request = event.Records[0].cf.request;
if (['HEAD', 'GET'].includes(request.method)) {
// These methods are cacheable and we don't ever take the cookies into
// account. Remove them from the headers.
delete request.headers.cookie;
}
// For other methods like POST we want to take cookies into account, since
// they matter for page previews (and possibly in other cases).
return request;
};
This has the effect of allowing cookies through for POST requests for the default ruleset, which is exactly what we need for previews. Cookies are removed from GET and HEAD requests, so they can be cached as they usually would be.
@tremby this is basically what I tend to do when working with Varnish ... I only cache HEAD and GET then strip all cookies from most requests and just let action ones keep their cookies and bypass the cache.
I did want to try and use something more off the shelf like CloudFront but keep steering away given how finicky you have to get with all this! Good to see your solution.
We’ve made checking for Live Preview requests in Cloudfront/CloudFlare much easier in Craft 3.1, as now Live Preview requests are sent with X-Craft-Token
headers (a4e5c177955852e7beea75cc0c027acdb2ef48d1). So now you just need to check for that.
Currently live previews work by making a POST request to an entity's public URL.
For example, an entity which would normally be publicly visible via a GET request to
/articles/article-1
is live previewed by making a POST request to/articles/article-1
while logged in as an admin.This causes live preview to fail when using Cloudfront (a caching CDN) for delivery of the website:
/admin/
prefix./articles/article-1
we either have to always drop the session ID cookie or never drop it. But it renders Cloudfront pretty useless if we never drop it.If the endpoint for all live previews was at a particular URL (something under
/admin
would be appropriate, since it's an admin function) we could add a Cloudfront rule to allow the cookie for POST requests there, and live previews would work.Some suggestions:
POST /admin/live-preview?url=%2Farticles%2Farticle-1
with no change to posted dataPOST /admin/live-preview
with something likeurl=/articles/article-1
added to the posted dataThis shouldn't affect cross-domain live previews; those could either be sent to the other domain's
/admin/live-preview
endpoint, or to the current domain's one with the full URL including domain in the data being passed.