craftcms / cms

Build bespoke content experiences with Craft.
https://craftcms.com
Other
3.27k stars 635 forks source link

Use URL under /admin for live previews #1758

Closed tremby closed 5 years ago

tremby commented 7 years ago

Currently live previews work by making a POST request to an entity's public URL.

For example, an entity which would normally be publicly visible via a GET request to /articles/article-1 is live previewed by making a POST request to /articles/article-1 while logged in as an admin.

This causes live preview to fail when using Cloudfront (a caching CDN) for delivery of the website:

If the endpoint for all live previews was at a particular URL (something under /admin would be appropriate, since it's an admin function) we could add a Cloudfront rule to allow the cookie for POST requests there, and live previews would work.

Some suggestions:

This shouldn't affect cross-domain live previews; those could either be sent to the other domain's /admin/live-preview endpoint, or to the current domain's one with the full URL including domain in the data being passed.

narration-sd commented 7 years ago

I'm wondering if providing a subdomain which on Cloudflare DNS control points to the base IP but is set to bypass caching (grey bypass arrow) will solve your problem.

I use this to avoid similar issues with sftp for various accesses that need it, such as direct editing, usually for staging when that is on a remote server..

The setting on Cloudflare means that only the DNS is being handled, with no influence from any Cloudflare process possibilities on the data transferred -- which is then actually coming direct from your server. You'd run admin then on the subdomain,

narration-sd commented 7 years ago

Sorry - above got truncated before sent -- repaired.

tremby commented 7 years ago

Using (AWS) Cloudfront here, not Cloudflare.

This (or just bypassing the CDN entirely) would possibly work in this particular case as long as Craft doesn't build full URLs including the domain when doing things like generating live preview URLs. But I believe it does, and so the requests would just end up being sent to the regular Cloudfront-domain URLs anyway. I wonder if this is a configuration issue.

But bypassing the CDN by using a different subdomain would not help in the case where people are already using different subdomains for different pages on their site, as described here.

My suggested change would, I believe, handle that case too.

narration-sd commented 7 years ago

Think you've got your names swapped (Cloudfront is AWS), but see your points.

Indeed actually the site itself won't run, much less CP, with a subdomain url. But this is rewritable to the expected url via nginx config or apache .htaccess, no?

I'm kind of wondering if this approach might also remove the subdomains problem you mention also -- maybe you could try it.

tremby commented 7 years ago

Think you've got your names swapped (Cloudfront is AWS), but see your points.

Yes, I noticed after posting and edited it.

Indeed actually the site itself won't run, much less CP, with a subdomain url. But this is rewritable to the expected url via nginx config or apache .htaccess, no?

I think perhaps we don't understand each other.

If I'm understanding your suggestion correctly, I would have a new subdomain origin.example.com which points to the same IP as Cloudfront points to (in my case this is the load balancer). This would bypass Cloudfront entirely. I understand this and it is in fact something I considered earlier. I initially rejected it because I don't want a separate domain open to the public leading to the same content, and locking it down by IP address and keeping that up to date with the clients' IP addresses would be a nightmare.

But the Craft website is configured to know that its URL is https://www.example.com, and writes URLs which start with this domain, including the live preview URLs. So even if I can get Craft to respond to https://origin.example.com/admin and get to an edit page, as soon as I hit "live preview" it'll POST to https://www.example.com/articles/article-1, which is handled by Cloudfront and will fail.

I'm kind of wondering if this approach might also remove the subdomains problem you mention also -- maybe you could try it.

I don't have a site which has different subdomains for different languages. It's just a problem I foresee (for other people) with having a special "admin" subdomain. You'd need special "admin" subdomains for each language subdomain then, and I'm guessing Craft would have to be aware of the mappings. It seems like it would be overcomplicated.

Do you see any issues with my suggested fix of moving all live preview endpoints to some URL under /admin?

brandonkelly commented 7 years ago

The reason LP uses the normal entry URL is so any conditions/other logic in templates that factor in the entry URL will work as expected, and any relative CSS/JS URLs will load correctly. We could fake the entry URL for template logic if we switched to something like this, but it would still break relative CSS/JS URLs (not that people should be doing that anyway).

That said it’s worth considering, especially when we get around to #1521 & #1489

narration-sd commented 7 years ago

About to leave here, but some quick responses:

tremby commented 7 years ago

@narration-sd:

nightmare keeping up moving IPs? Well, that might be true with Cloudfront. On Cloudflare it's very easy. Or maybe just use a CNAME, if double lookup doesn't materially affect access speed? You have to keep up with the IPs anyway, don't you?

Clients' IPs, not server IPs. We wouldn't want a non-cached entry point to the site available to the public. I suppose we could add HTTP basic auth on this special domain and lock it down that way...

I'm not suggesting you make the Craft CP (or site) knowledgeable about multiple (sub)domains. I'm suggesting you rewrite them in server setup so that Craft always sees the canonical one. Again, nginx config or apache htaccess rules.

But as I said, Craft sends the POST request to its canonical URL, not to the current one, so how would the live previews hit origin.example.com? Perhaps there's some server config you're referring to I'm not aware of which rewrites domains found in HTTP responses?

tremby commented 7 years ago

@brandonkelly:

The reason LP uses the normal entry URL is so any conditions/other logic in templates that factor in the entry URL will work as expected, and any relative CSS/JS URLs will load correctly. We could fake the entry URL for template logic if we switched to something like this, but it would still break relative CSS/JS URLs

Interesting. Would adding a <base> tag to the outgoing HTML containing the entity's public URL solve this?

(not that people should be doing that anyway).

Agreed. I wouldn't be against breaking live preview for these users and documenting it...

narration-sd commented 7 years ago

@tremby sorry, really am in a hurry .Think this looks good on quick scan, about how it's done on nginx, and similarly enough for apache.

https://www.digitalocean.com/community/tutorials/how-to-create-temporary-and-permanent-redirects-with-nginx

On the auth/available problem, not sure I see that if SEO (answerable via canonical would seem) isn't the issue, but that's up to you and client of course - a pleasant smile :)

As well, seems you could with related server config also redirect if someone accessed the non-cached url not as a CP request, which wouldn't need the headache of auth. Given Craft is set up properly with regard to clicking on the Site name from CP, should work even there?

Best fortune, in any case, Bart, and you see Craft principals are already on the case...!

brandonkelly commented 7 years ago

Interesting. Would adding a tag to the outgoing HTML containing the entity's public URL solve this?

Probably but we'd also need to check if they already define a <base> tag, and not really interested in messing with peoples’ HTML.

tremby commented 7 years ago

@narration-sd:

Think this looks good on quick scan, about how it's done on nginx, and similar enough for apache.

https://www.digitalocean.com/community/tutorials/how-to-create-temporary-and-permanent-redirects-with-nginx

Sorry, but I really still don't see how this helps. Aren't these just regular rewrite rules, which change how the server interprets a request URL or cause an HTTP redirection to elsewhere? The problem is the URLs the server (Craft) is sending back out. In the Javascript chunk for live previews, for example, it gives the URL to send the POST request to as a full URL with domain, and that domain is the canonical domain. So the Javascript will then send the POST request to the canonical domain.

Are you suggesting I set up rewrite rules on the www.example.com server so that POST requests to URLs such as /articles/article-1 get redirected to origin.example.com? When most (all?) clients get a 301 or 302 response they will send again to the new URL with GET rather than POST (for security reasons, I think), so I'm not sure this would work.

narration-sd commented 7 years ago

Definitely crossed wires here, and my hurry isn't helping.

Ok, and that really is all I can cook up for now. Fire away though for the thread where I'm missing something, as likelihood seems reasonably high at this level of handwaving -- cheers

tremby commented 7 years ago

Craft will reply leading to request of the canonical url, and that looks like it wouild cause cross-domain issues for the Live View panel, thus failure in a different mode than your original problem.

They'd have a common ancestor domain, so with CORS and cookie domain set up correctly that shouldn't be an issue. It's still the issue that the POST will go to the canonical URL, and that this will point at Cloudfront.

if it would work (needs trial to verify no presently-unseens interfere), maybe the simplest thing Craft could do as a feature to help is prefix Live Preview paths with a known (configurable general.php) pseudopath, making the case easier to recognize.

This is exactly what my original post here is suggesting, but which Brandon points out could break theme logic and relative CSS/JS URLs.

In the mean time I'm exploring some solutions. The current one I'm exploring is a small opinionated patch on Craft CMS (:-1:) to alter the preview URLs sent out, combined with some nginx configuration. If I come up with a working solution I'll post it, but as yet I've found no way to do this without altering Craft code.

tremby commented 7 years ago

OK, my current solution is this:

  1. Set up a new subdomain and server block origin.example.com. In my case this has the same document root as the main server block which Cloudfront points at, but has a few differences:

    1. It has CORS headers set up (as documented)

    2. It redirects GET and HEAD requests to the canonical domain via 301 responses:

      location / {
          if ($request_method ~ "^(GET|HEAD)$") {
              return 301 $scheme://www.example.com$request_uri;
          }
      
          # Regular Craft rule:
          try_files $uri $uri/ /index.php?$query_string;
      }
    3. It doesn't have any of the more specific directives specified in the main www server section which would only apply to GET requests anyway, like responding specially with far-future expiry dates to requests for asset files. It still does have the block responding to .php files, setting up fasctcgi_pass etc. If there's a nicer way I'd love to hear it. It feels to me like there ought to be a way to write this server block in 5 lines or so without repeating so much stuff from the main server block.

  2. Set up Craft to use the shared ancestor domain for cookies, as documented, so in this case that's .example.com.

  3. Add a new environment variable PREVIEW_URL, which points to the admin domain, PREVIEW_URL=https://origin.example.com. In non-production environments this can be blank to behave like normal.

  4. Add a small patch to each craft/app/controllers/EntriesController.php:254 and craft/app/controllers/CategoriesController.php:286, inside the "Enable Live Preview?" conditionals. (This code also uses a CRAFT_URL environment variable, and I'm not sure if this is standard or just in our application. It's the scheme and domain of the site with no trailing slash. craft()->getSiteUrl() gave me something including a trailing slash, which was not ideal.)

    diff --git a/craft/app/controllers/CategoriesController.php b/craft/app/controllers/CategoriesController.php
    index 5514fe9..35a1919 100644
    --- a/craft/app/controllers/CategoriesController.php
    +++ b/craft/app/controllers/CategoriesController.php
    @@ -286,10 +286,19 @@ class CategoriesController extends BaseController
            // Enable Live Preview?
            if (!craft()->request->isMobileBrowser(true) && craft()->categories->isGroupTemplateValid($variables['group']))
            {
    +           $previewUrl = $variables['category']->getUrl();
    +           $previewBase = getenv('PREVIEW_URL');
    +           if ($previewBase) {
    +               if ($previewUrl[0] === '/') {
    +                   $previewUrl = $previewBase . $previewUrl;
    +               } else {
    +                   $previewUrl = str_replace(getenv('CRAFT_URL'), $previewBase, $previewUrl);
    +               }
    +           }
                craft()->templates->includeJs('Craft.LivePreview.init('.JsonHelper::encode(array(
                    'fields'        => '#title-field, #fields > div > div > .field',
                    'extraFields'   => '#settings',
    -               'previewUrl'    => $variables['category']->getUrl(),
    +               'previewUrl'    => $previewUrl,
                    'previewAction' => 'categories/previewCategory',
                    'previewParams' => array(
                                           'groupId'    => $variables['group']->id,
    diff --git a/craft/app/controllers/EntriesController.php b/craft/app/controllers/EntriesController.php
    index bd20a2e..691a0bc 100644
    --- a/craft/app/controllers/EntriesController.php
    +++ b/craft/app/controllers/EntriesController.php
    @@ -254,10 +254,19 @@ class EntriesController extends BaseEntriesController
            // Enable Live Preview?
            if (!craft()->request->isMobileBrowser(true) && craft()->sections->isSectionTemplateValid($variables['section']))
            {
    +           $previewUrl = $variables['entry']->getUrl();
    +           $previewBase = getenv('PREVIEW_URL');
    +           if ($previewBase) {
    +               if ($previewUrl[0] === '/') {
    +                   $previewUrl = $previewBase . $previewUrl;
    +               } else {
    +                   $previewUrl = str_replace(getenv('CRAFT_URL'), $previewBase, $previewUrl);
    +               }
    +           }
                craft()->templates->includeJs('Craft.LivePreview.init('.JsonHelper::encode(array(
                    'fields'        => '#title-field, #fields > div > div > .field',
                    'extraFields'   => '#settings',
    -               'previewUrl'    => $variables['entry']->getUrl(),
    +               'previewUrl'    => $previewUrl,
                    'previewAction' => 'entries/previewEntry',
                    'previewParams' => array(
                                           'sectionId' => $variables['section']->id,

    This patch changes the domain of the live preview URLs passed to the admin panel, so that when a live preview is requested the POST request goes to the alternative domain.

The browser sends the auth cookie through thanks to the cookie domain being an ancestor of both sites, and the alternative domain allows the POST request due to its CORS headers. The alternative domain hits PHP with the request with serves the response. Any GET requests which come through to the alternative domain (maybe for assets and that kind of thing, or followed links to other pages) will get 301 responses and so be redirected to go through the cache.

I don't much like this solution.

narration-sd commented 7 years ago

(couldn't get out of here, realized it will have to be tomorrow)

Regards, Clive

tremby commented 7 years ago

I agree with everything you've said. I hope changing the endpoint for live previews is something which might eventually make it into Craft. I guess it's just a matter of making a decision about whether it's okay to break live previews for sites with relative asset URLs.

In the mean time, I wonder if there are any ideas about how I might alter my above solution so that no files from Craft itself are modified? Is there a way I can achieve the same thing by configuration and possibly extending classes rather than editing core files?

narration-sd commented 7 years ago

You know, I had another quick go at this and discovered Cloudfront rule matching is pretty unelaborate, no doubt for the efficiency - no regex-like abilities. A glance at CloudFlare doesn't show much better.

Reading once again your original problem statement, though, might you be able to solve by simply excluding any CP (admin) paths from caching?

At first glance this looks unsupported, as there isn't a NOT operator in path patterns, but then I stumbled across something which you can find on this page: Developer Guide

If you want to apply a different cache behavior to the files in the images/product1 directory than the files in the images and images/product2 directories, create a separate cache behavior for images/product1 and move that cache behavior to a position above (before) the cache behavior for the images directory.

So, could your solution be to encode a rule matching the Craft admin introducer, which doesn't cache, and put that rule before others that do?

...I am quite possibly being too simpleminded about this, and also this time jumping right out the door, but you can look at it...

tremby commented 7 years ago

That doesn't help. The POST requests for live preview pages are not to addresses under the /admin prefix. That's what this entire thread is about.

narration-sd commented 7 years ago

Hmm, yes, you did say that at the outset. I was in fact fishing around in the code at some rapid rate this afternoon and wanted to think it said something else, but what I saw makes more sense this way. I believe I might have misinterpreted your example because it looked like (though reversed of) the number-hyphen addition the CP puts on the entry slug when editing it; was thinking too much about that.

Very interested actually in what you might make of one of these paths, Bart.

p.s. just for completeness I went out looking for any equivalent services on CloudFlare. I didn't find any, though there are a number of initiatives for public domain or commercial Lambda alternatives on platforms, not so far along yet it seemed. However, CloudFlare has by now long had Railgun, a different take on active page caching, which I once tried to pretty nice results for a site needing European presence, potentially serving especially Russia. Using CloudFlare, you could go to the simple alternative uncached domain equivalent I early mentioned here, and as you actually use in your patch scheme above. Probably not attractive since you appear embedded with AWS, but any of these mentioned could possibly be worth a little keeping track of, no?

tremby commented 7 years ago

The Lambda@Edge idea is interesting, and I'll look into it. Thanks.

narration-sd commented 7 years ago

Good enough.

I had a flash this morning on why CloudFlare/Railgun didn't seem to have problems that you run into with the SessionId cookie. It's because it sends back only the delta for the cached item each time. This means you pay at least ttfb including latency for each page access, so it is not quite as effective as a cache, though they employ a pre-http/2-spdy persistent channel that speeds up any other delta.

Still, this flavor might interest, if you don't get somewhere positive with Lambda for example. If you wanted to try an experiment with it, Arcustech who runs Craft's own site and has a long good history with the community offers Railgun-included even on lowest-price hosting; you just have to ask for it.

tremby commented 7 years ago

I had a flash this morning on why CloudFlare/Railgun didn't seem to have problems that you run into with the SessionId cookie. It's because it sends back only the delta for the cached item each time.

I don't understand.

If you are sending a POST request for a live preview, and your CDN is filtering out the session ID cookie, you will get a redirection to the log in page. How does what you're describing get around that?

narration-sd commented 7 years ago

In my present view of how Railgun works, nothing is filtered out -- that was the basis of the flash.

SessionId, CSRF, and any other changed cookies would all come through as efficient deltas, with the minimal delay of one accelerated server access, then patched into the cached delivery of all the rest.

tremby commented 7 years ago

I looked up Railgun. I appears to be a compression scheme between the origin server and the CDN. It's not something we're interested in, but thanks.

narration-sd commented 7 years ago

Well, I think you'll find it's actually a cache -- overlaid with the accelerated delta of anything that does actually change on active pages. A careful read of first section here?

https://www.cloudflare.com/docs/railgun/intro.html

The normal cache operation occurs for non-active accesses like images, js, etc..

But anyway, of course you have to see what's good for your purpose...

tremby commented 7 years ago

I did read it. We want public pages which are essentially static (which is everything but admin pages) to be cached too. Requests for each should not hit the origin server more than once per hour or so. Even with Railgun the origin server is still fully processing each request and building the response, and then additionally the Railgun daemon is processing or diffing it in whatever way Cloudflare has devised and sending a smaller payload back to the CDN.

narration-sd commented 7 years ago

All true, and only you know how efficient you want to be.

It's just natural on long experience to question the degree necessary -- realities vs. ideal concept images -- but probably you and team are well aware of that also, as a 'disease', 'sometimes' occuring in software circles :)

narration-sd commented 7 years ago

Bart, you may well be tired of hearing from me on this, but something possibly interesting got pushed to my phone by Google Now over the weekend.

http://codingthesmartway.com/introduction-to-server-rendered-vue-js-apps-with-nuxt/

Given you could fit with Vue, nuxt-generate would produce a completely static image of your site at will.

Given your site is so largely static as you indicate, may be a way to quite simplify matters?

The developers look an interesting bunch, in any case.

Their doc site here is running this kind of static arrangement, and turns out they even use a Lambda script to cleanly deploy it.

narration-sd commented 7 years ago

As well, the Vuejs-nuxt discussion of async data suggests a couple of things:

Not delving into this, but quite possibly /action could be useful this way also for protecting any actions that would like to have session information...whether that comes up now or in the future.

tremby commented 6 years ago

The agency I'm contracting for launched a new site today, this time with Craft 3. It seems nothing changed on this topic between Craft 2 and 3 -- still no previews possible with the CDN setup we want.

So I had a go at making the Lambda@Edge function suggested above and have that working.

Cloudfront is set up like this:

This Lambda function simply removes any cookies from incoming HEAD and GET requests, and leaves others alone. Here it is:

exports.handler = async (event) => {
    // See https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-event-structure.html
    const request = event.Records[0].cf.request;

    if (['HEAD', 'GET'].includes(request.method)) {
        // These methods are cacheable and we don't ever take the cookies into
        // account. Remove them from the headers.
        delete request.headers.cookie;
    }

    // For other methods like POST we want to take cookies into account, since
    // they matter for page previews (and possibly in other cases).
    return request;
};

This has the effect of allowing cookies through for POST requests for the default ruleset, which is exactly what we need for previews. Cookies are removed from GET and HEAD requests, so they can be cached as they usually would be.

joshangell commented 6 years ago

@tremby this is basically what I tend to do when working with Varnish ... I only cache HEAD and GET then strip all cookies from most requests and just let action ones keep their cookies and bypass the cache.

I did want to try and use something more off the shelf like CloudFront but keep steering away given how finicky you have to get with all this! Good to see your solution.

brandonkelly commented 5 years ago

We’ve made checking for Live Preview requests in Cloudfront/CloudFlare much easier in Craft 3.1, as now Live Preview requests are sent with X-Craft-Token headers (a4e5c177955852e7beea75cc0c027acdb2ef48d1). So now you just need to check for that.