frostschutz / MyBB-Google-SEO

Search Engine Optimization plugin for MyBB.
https://community.mybb.com/thread-202483.html
GNU General Public License v3.0

NGINX Proxy to Apache #59

Closed by Sweets 6 years ago

Sweets commented 6 years ago

Some forum administrators, myself included, run an NGINX front-end that proxies to an Apache back-end (NGINX serving static content, Apache serving dynamic content).

Doing rewrites in NGINX can be a pain with this plugin (assuming that NGINX is proxying to Apache, that is; otherwise, it's solid), so I've dug around and made some edits that make it much easier to set up with an NGINX & Apache configuration.

frostschutz commented 6 years ago

Can you describe your problem more clearly, and do you have Google SEO Redirect Debug info for me? (As admin, pass ?&google_seo_redirect=debug directly as parameter to a thread URL.)

From your changes it's hard to tell what you're trying to fix there. It seems like a workaround to a broken webserver configuration, which might be useful to you but not anyone else.

There is a known issue with redirects and proxies ( #48 #57 ). The redirect code was written in a time when https was still rare (no free letsencrypt certificates for everyone) and proxies were even rarer (no cloudflare everywhere) and I tried to do too much.

Sweets commented 6 years ago

It's not a bug; the behaviour is expected. In the master branch of the repository in its current condition, rewrites won't work properly for NGINX + Apache setups.

My fork adds functionality, so that the plugin works with an NGINX + Apache web-server configuration (both webservers running on a machine, NGINX proxying to Apache for dynamic content, i.e. php files).

I may be a bit vague in saying "NGINX + Apache". To elaborate, basically, the web-server handling the requests from users directly is NGINX. It serves content from a root directory, which is what users access. However, in the case of dynamic content, it will proxy to Apache, running on a different port, but the same server.

The reason that a server administrator would do this is that it gives the best of both NGINX and Apache. NGINX is great for performance and serving static content, but its downfall is in dynamic content, with fastcgi. This is where Apache shines, as it's better at serving dynamic content than NGINX.

Hopefully I explained everything in a way that is easy to understand.

frostschutz commented 6 years ago

I have the same setup - nginx front end, apache back end. What I don't understand is why you rewrite in nginx. If you're going to pass to apache anyway, apache can also do the rewrites, and then PHP sees the rewrite.

In your idea (rewrite already handled by the time apache sees the request) you basically have to disable google seo redirect entirely - that's not adding functionality but removing it. Do I have it wrong?

Do you have a link to the site you're currently using this on?

Sweets commented 6 years ago

I guess that is one way to look at it, regarding the comment about removing functionality. Though I'm still inclined to say otherwise.

I've got NGINX handling rewrites as it seemed more plausible to have the front-end handling rewrites. Also, no, I don't currently have a site that uses NGINX and proxies to Apache, this is on a localhost (it will, however, be used on my forum when I finish developing v2).

For some reason Apache doesn't play well with the rewrites provided by default, and again it made more sense in my mind to have NGINX do the rewrites anyway. I should probably specify that I use URLs with slashes in them, which, without rewrites, look like directories to the web-server in question.

e.g. thread/thread-name, profile/username, etc.

Though, I guess it would be highly specific functionality added, or from your perspective removed (and likely from others' perspectives too), as it would only really apply to rewrites being done by NGINX.

frostschutz commented 6 years ago

Apache has no issue rewriting URLs with slashes in them (provided, perhaps, that those directories don't actually exist). The reason slash URLs are problematic with MyBB is MyBB itself: it uses relative links everywhere, and with a (virtual) directory structure those links are all resolved relative to the directory, so you end up at site/thread/thread/thread/subject instead of site/thread/subject. That needs a lot of workarounds (see documentation) and even code changes (see for example #43).

Your change to url.php looks like an attempt to turn relative URLs into absolute ones, but that doesn't work well either. The function is supposed to return relative URLs and is used as such in places, so you might end up seeing http://yoursite/http://yoursite/some/place (doubled http://yoursite in some links).

"Also, no, I don't currently have a site that uses NGINX and proxies to Apache, this is on a localhost"

I was wondering how redirects still worked on your site, the way I read your code they shouldn't.

Simple test cases:

I don't think those redirects still work in your case (nginx says: I've already rewritten this, but nginx won't know if the URL was correct after all) - hence functionality removed. This could still be handled in another place (url.php knows if you gave it the correct subject or not, since it also updates them) but it's not done there, because redirect is supposed to be able to handle it in a more generic fashion.

Sweets commented 6 years ago

As for the provided test cases, none of them work. I suppose I'll just work on the rewrites with Apache.

Sweets commented 6 years ago

Something just occurred to me -- if you're doing rewrites via Apache, then how are you proxying to it? NGINX sees (thread/name) as a path, and won't see name as a php file.

My configuration matches locations ending with .php, and if the user's request matches, it proxies to Apache. Otherwise, NGINX serves the content.
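For illustration, a setup of the kind described here might look roughly like the sketch below. The server name, document root and back-end port are assumptions made for the example; the actual configuration was not posted in this thread.

```nginx
# Hypothetical sketch: only *.php is proxied, everything else is served by NGINX.
server {
    listen 80;
    server_name forum.example;        # assumed name
    root /var/www/forum;              # assumed MyBB root directory

    # Requests ending in .php go to the Apache back-end
    # (assumed to listen on port 8080 of the same machine).
    location ~ \.php$ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }

    # Everything else is treated as a static file. A URL such as
    # /thread/thread-name lands here, no such file exists, and NGINX
    # answers 404 itself, so Apache never sees the request.
    location / {
        try_files $uri =404;
    }
}
```

With this layout, any rewriting of thread/thread-name style URLs has to happen in NGINX, because such a path never matches the .php location.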

EDIT: Sorry for the "close" spam. Originally I closed the issue, but then reopened when I went back to going over my NGINX config. The next close was an accident.

frostschutz commented 6 years ago

"if you're doing rewrites via Apache, then how are you proxying to it?"

Unconditionally. (Sorry if you expected something sophisticated here.) Basically you can do something like try_files $uri @apache, so nginx still serves static files directly but proxies everything else. Requests for *.php still need to be matched and proxied too, or you'll be serving source code, database passwords, etc.
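A minimal sketch of that idea follows. Only try_files $uri @apache comes from the comment above; the server name, root path, back-end address and forwarded headers are assumptions, and this is not the configuration actually running on either site.

```nginx
# Hypothetical sketch: serve existing static files, proxy everything else.
server {
    listen 80;
    server_name forum.example;        # assumed name
    root /var/www/forum;              # assumed MyBB root directory

    # PHP files exist on disk, so they must be proxied explicitly;
    # otherwise try_files below would hand out their source as plain text.
    location ~ \.php$ {
        proxy_pass http://127.0.0.1:8080;   # assumed Apache address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Never serve Apache's per-directory files (.htaccess, .htpasswd).
    location ~ /\.ht {
        deny all;
    }

    # Static file if it exists, otherwise fall through to Apache.
    location / {
        try_files $uri @apache;
    }

    # Unknown paths (thread/thread-name, nonexistent URLs, ...) reach Apache,
    # where the .htaccess rewrites and Google SEO 404 handling can run.
    location @apache {
        proxy_pass http://127.0.0.1:8080;   # assumed Apache address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Because the regex location takes precedence over the plain prefix location, a request for a .php file is proxied before try_files is ever consulted.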

You kind of have to pass all (unknown) requests to Apache unconditionally, if you want Google SEO 404 to work as well. That's optional though, since you can configure nginx to do a nice 404 page by itself. Otherwise it will look like https://eternal.is/this-does-not-exist though.

I think most people have this largely unconditional proxy kind of setup, and nginx is mainly utilized to handle load balancing and such things, making sure the apache in the background does not suffocate.