gma / nesta

File Based CMS and Static Site Generator
http://nestacms.com
MIT License
902 stars 121 forks source link

Links to assets on same host should not include hostname #103

Closed gma closed 10 years ago

gma commented 12 years ago

If you run Nesta behind a proxy server (such as nginx) with caching enabled, the files that get cached to disk contain absolute links such as http://127.0.0.1/css/master.css. They ought to just show /css/master.css.

I imagine the same problem applies for links made to pages.

I strongly suspect we introduced this bug in #94, with the url helper from Sinatra.

/cc @ms

gma commented 12 years ago

Just doing a quick brain dump to capture my current thoughts on what we should do here.

  1. Sinatra's url method is returning absolute URLs (see #93 and #94).
  2. We could use the url method to return relative paths by passing false as the second argument, which we would have to work out relative to our mount point (in order to satisfy the requirements in #93). I worry that the use of url in the code (even if we always pass absolute = false) would lead to this problem re-occuring in future when somebody uses url without realising the danger of absolute URLs.
  3. When you embed Nesta in the same process as Rails, and get them to share their views, you find that the url method isn't defined. If you include Sinatra::Helpers in your Rails helper class (which makes url available to the view) you then run into trouble with Sinatra and Rails extending Rack::Request in different ways, and url relies up on Sinatra's extensions. You can work around this, but it's hacky.
  4. Given (2) and (3), I think we should stop calling url altogether, and write our own path_to helper. It'll look a lot like the old url_for, but will know how to cope with the path that we're mounted on.
  5. We need to wrap some tests round this to prevent things from going pear shaped.

@MicahChalmer, @ms – Thoughts?

MicahChalmer commented 12 years ago

Makes sense to me. Seems a shame to have to re-implement what's already there in Sinatra, but given the conflict with Rails, it looks like it's needed it to make the Rails embedding work.

ms commented 12 years ago

Sorry about the delay in feedback. I've been pretty busy.

I actually hadn't read this bug report when I realized that the last version behaved like this. I downgraded to 0.9.11 on my production server.

I think that even if url([path], false) we'd run into problems if Rack doesn't know that nginx has it under a subpath (like example.com/blog/). Is that correct? Potentially if we set up both nginx to redirect /blog/ to Rack and Rack as only accepting /blog/, it could work? I'm not entirely clear on that.

I am actually surprised I didn't catch that. I'll try any future changes I make on prod before asking you to merge code. It would be really nice to have smart tests to check that all the assumptions we make about getting the correct url work in different scenarios.

About point 3, I'm not really familiar with embedding Nesta in a Rails process. Doesn't Nesta need some Sinatra code to run? If that's the case, wouldn't the helpers be imported no matter what?

Looking simply at 1 and 2, the solution seems to be to define our own relative_url that simply calls url and passes it false, together with tests to make sure nobody uses url incorrectly in the future. But 3 seems to make this impractical.

Finally for 4, what type of information would we need to deal with paths? I haven't read the source code of Sinatra's url, but would we simply copy their code?

gma commented 12 years ago

@ms - I think that the key to working in as many scenarios as possible will be always using Rack to mount Nesta at a non-root path. Nginx et al will then just be able to send requests straight through and let Rack deal with any relative path stuff.

I appreciate the thought about trying stuff in production, but there are too many scenarios to catch everything. Really, we need decent integration tests, and I'm not going to get into that while the tests are still in RSpec. I'm switching them to MiniTest, after which point they'll be easier to tidy up. Rack::Test will be very useful in making sure this path stuff works in different scenarios.

To put Nesta in Rails, all you need is for Nesta's own helpers to work in Rails. I've got a Rails view helper class that deals with that; it includes Nesta::View::Helpers and then overrides one or two of the default helpers. Aside from the url method there was no need to include Sinatra::Helpers in the Rails helper, which was really nice.

The new url helper will be easy to write; we can mimic Sinatra's, and remove the API that they added to Rack::Request that won't work in Rails (it's a pointless addition). Then we can just guarantee that it makes is relative paths, and we're good…

MicahChalmer commented 12 years ago

One question about this: the atom feed and sitemap need absolute URLs, and were using absolute URLs (obtained via Rack::Request#host, which is the same thing that Sinatra's url method uses) even before #94 was committed. Did they not have the same caching/hostname issue described in the original issue here? If they did, what should our new code do about it?

Or is this a moot point in light of #109 and the impending removal of the cache code altogether, and this issue only open to get rid of the Sinatra::Helpers dependency?

gma commented 12 years ago

One question about this: the atom feed and sitemap need absolute URLs, and were using absolute URLs (obtained via Rack::Request#host, which is the same thing that Sinatra's url method uses) even before #94 was committed. Did they not have the same caching/hostname issue described in the original issue here?

Good question – I'll investigate…

gma commented 12 years ago

Okay, yes it is (was) a problem. I just setup 0.9.11 behind Nginx on my desktop machine and it hard codes http://localhost:9393 into the link elements in the Atom feed. This is a problem! I'll have a think.

gma commented 12 years ago

Okay, I've been doing some reading. Anybody who is running Nginx in front of a Ruby web server that runs Nesta should be able to configure Nginx to cope with this problem themselves.

In Nginx you need to set the HTTP_HOST header adjacent to the proxy_pass directive. This config will forward the value of HTTP_HOST that was set in the request received by Nginx through to the back end web server:

proxy_pass http://localhost:5000;
proxy_set_header Host $http_host;

See http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_set_header

I presume other proxies will support the same thing.

What this means for this patch is that it's okay to render absolute URLs (but I still think we should only do it when we really need to, such as in an Atom feed).

MicahChalmer commented 12 years ago

Cool. That's what my pull request #110 does--full URLs in the atom feed and sitemap, but just paths everywhere else.

gma commented 12 years ago

Aye. It's what Nesta did originally too; I think we're good. I'm going to merge it soon, and then update the API a bit so that path_to has a partner that does the full URL stuff. Then (after a bit of testing) we'll be looking at 0.10.0.

gma commented 10 years ago

Coming back to this with fresh eyes, I'm going to close this issue. I don't consider current behaviour to be a serious bug (given my previous comment about how to configure a reverse proxy to control what the URI is set to, and our code in #110).