gWorldz / get-simple-cms

Automatically exported from code.google.com/p/get-simple-cms
GNU General Public License v3.0

New HTACCESS RewriteRule #307

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I would like to propose a new RewriteRule.

This new RewriteRule should help plugin authors who need to look for specific 
things in the URL (e.g. a short URL plugin), and in the future it might allow 
the same page slug to be used for several pages, depending on their parent.

RewriteRule (^|/)(([^/]+)/)?([^/]+)/?$ index.php?parentid=$3&id=$4 [QSA,L]

Some comments on what this does:

1. (^|/) => The match should start at the beginning of the requested URL or at 
a /. The currently used /? means the match can start in the middle of the 
requested path. That is, if I request /==a it currently matches "a" as the page 
ID instead of "==a", because the match simply starts wherever [A-Za-z0-9_-] 
first matches something.

2. (([^/]+)/)? => This matches a single parent folder in the URL, without its 
trailing /. It can contain any character except /, instead of just 
[A-Za-z0-9_-]. If a longer URL like /1/2/3/4/5/page is given, it matches the 
rightmost parent (5).

3. ([^/]+)/?$ => Just like the original matcher, this matches the last part of 
the URL, excluding a possible trailing slash, but it accepts any character 
other than /.
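
To make the three points above concrete, here is a minimal Python sketch that applies the proposed pattern to a few request paths. It rests on several assumptions: Python's `re` module stands in for the regex engine used by mod_rewrite, the leading slash is already stripped (as Apache does for rules in a .htaccess file), and `CURRENT_RULE` is only a reconstruction from the description in point 1, not a quote from the shipped .htaccess. The helper name `rewrite` is hypothetical.

```python
import re

# The proposed pattern, copied from the RewriteRule above.
PROPOSED_RULE = re.compile(r"(^|/)(([^/]+)/)?([^/]+)/?$")

# ASSUMPTION: reconstructed from point 1 ("/?" followed by [A-Za-z0-9_-]);
# the actual current rule is not quoted in this issue.
CURRENT_RULE = re.compile(r"/?([A-Za-z0-9_-]+)/?$")


def rewrite(path):
    """Return (parentid, id) as the proposed rule would capture them."""
    match = PROPOSED_RULE.search(path)
    if match is None:
        return None
    # ASSUMPTION: an unmatched optional group ends up as an empty value.
    return (match.group(3) or "", match.group(4))


print(rewrite("==a"))             # ('', '==a')
print(rewrite("parent/page/"))    # ('parent', 'page')
print(rewrite("1/2/3/4/5/page"))  # ('5', 'page')

# The reconstructed current pattern starts matching mid-segment:
print(CURRENT_RULE.search("==a").group(1))  # a
```

The last line illustrates point 1: without the (^|/) anchor, the match is free to begin at the first character that fits the character class.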

A nice extra with this is that every single non-existing request is passed to 
GetSimple, giving the user the ability to display their 404 page. Currently a 
request like /==a is not passed to GetSimple and will show the server's 
default 404.

Please discuss; this is a pretty big must-have for writing a good URL shortener 
;-) Something to think about is whether we might want to opt for a more 
WordPress-like system and just pass everything along to PHP. Or, if you want 
a little more structure (i.e. keep the $_GET['id'] separate), we could do 
something like:

RewriteRule (^|/)(([^/]+/)+)?([^/]+)/?$ index.php?path=$2&id=$4 [QSA,L]

Original issue reported on code.google.com by martijn.personal@gmail.com on 24 Mar 2012 at 12:02

GoogleCodeExporter commented 9 years ago
I think the only thing I am concerned about is what happens to existing 
installations if they upgrade to 3.2 and this is the new .htaccess file. Will 
it break anything?

Original comment by ccagle8 on 24 Mar 2012 at 12:38

GoogleCodeExporter commented 9 years ago
You can drop this new RewriteRule into your current 3.1 right now without 
breaking anything. At least, I haven't found any problems. That's why I 
kept the $_GET['id'] in there; this means we don't actually have to change 
any of the code.

If you want to reuse slugs per parent, we will have to use a new filename 
pattern for the XML files. Maybe something like parent.slug.xml. This would 
break old XML files, yes. I think we might need something like an upgrade.php 
file, as seen in many other CMSs. This PHP file would run when upgrading 
between versions to fix database differences between the two; in our case, 
filename differences.

Also, the last one that includes the full path might be good for plugins like 
I18N. Instead of having to ask the user to change their HTACCESS file (which 
leads to problems: http://get-simple.info/forum/post/24803/#p24803) the plugin 
can just look for the first part of the $_GET['path'] and match it against the 
languages that are in use.

/de/whatever/parent/page would turn into
• $_GET['id'] == "page", so current GetSimple code still works.
• $_GET['path'] == "whatever/parent/".
I18N could then just match "whatever" against a list.

Original comment by martijn.personal@gmail.com on 24 Mar 2012 at 12:49

GoogleCodeExporter commented 9 years ago
That last part should be:

/de/whatever/parent/page would turn into
• $_GET['id'] == "page", so current GetSimple code still works.
• $_GET['path'] == "de/whatever/parent/".
I18N could then just take the first part of the path as language ("de") and 
build the XML file name from there: "page"."_"."de".".xml".
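
The corrected example above can be sketched in Python as follows. Python's `re` stands in for mod_rewrite's regex engine, the leading slash is assumed to be stripped already, and `split_request` and `xml_name_for` are hypothetical helpers, not part of GetSimple or the I18N plugin.

```python
import re

# Pattern from the second RewriteRule in the original issue (path=$2, id=$4).
PATH_RULE = re.compile(r"(^|/)(([^/]+/)+)?([^/]+)/?$")


def split_request(request):
    """Return (path, id) as the rule would pass them to index.php."""
    match = PATH_RULE.search(request)
    if match is None:
        return None
    return (match.group(2) or "", match.group(4))


def xml_name_for(request):
    """Hypothetical I18N step: treat the first path segment as the language.

    Assumes the rule matched; a real plugin would also check the language
    against its list of languages in use.
    """
    path, page_id = split_request(request)
    language = path.split("/")[0]
    # Without a language prefix, fall back to the plain slug.
    return f"{page_id}_{language}.xml" if language else f"{page_id}.xml"


print(split_request("de/whatever/parent/page"))  # ('de/whatever/parent/', 'page')
print(xml_name_for("de/whatever/parent/page"))   # page_de.xml
```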

Original comment by martijn.personal@gmail.com on 24 Mar 2012 at 1:01

GoogleCodeExporter commented 9 years ago
I put this up for discussion and testing on the forum: 
http://get-simple.info/forum/topic/3684/help-test-a-possible-future-change/. It 
needs a better topic title, so change it if you have a better idea.

Updated rewrite rule:

RewriteRule ^((([^/]+/)*[^/]+)/)?([^/]+)/?$ index.php?path=$2&id=$4 [QSA,L]

Why so complex? This will allow for a simple explode() on the PHP side; no 
further parsing is needed in PHP. That keeps it simple and easy to adapt for 
plugin developers. I made a note about that on the forum as well.
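
As a rough illustration of why this version needs only a split on the receiving side, here is a Python sketch. It assumes Python's `re` behaves like mod_rewrite's engine for this pattern and that the leading slash is already stripped; `parse` is a hypothetical helper, and `str.split` plays the role of PHP's explode().

```python
import re

# Updated pattern from the comment above (path=$2, id=$4).
UPDATED_RULE = re.compile(r"^((([^/]+/)*[^/]+)/)?([^/]+)/?$")


def parse(request):
    """Return (list of parent segments, id) for a request path."""
    match = UPDATED_RULE.match(request)
    if match is None:
        return None
    path = match.group(2) or ""
    # The path has no leading or trailing slash, so a plain split is enough.
    # An empty path is handled separately to get an empty list.
    parents = path.split("/") if path else []
    return (parents, match.group(4))


print(parse("de/whatever/parent/page"))  # (['de', 'whatever', 'parent'], 'page')
print(parse("parent/page/"))             # (['parent'], 'page')
print(parse("page"))                     # ([], 'page')
```

Compared with the earlier path rule, the captured path no longer ends in a slash, which is what removes the need to trim before splitting.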

Original comment by martijn.personal@gmail.com on 24 Mar 2012 at 4:05

GoogleCodeExporter commented 9 years ago
Been away for a few days so only looking at this now. 

Looks good, but I have some reservations about all these changes to .htaccess 
files, which have a knock-on effect on those of us who don't use Apache and 
have to rewrite these changes for our systems. Have you seen Zeus rewrite 
scripts?? 8) 

So why don't we pass the whole URI to PHP and do our parsing there? 
Surely it is better to have GS working out of the box on as many host systems 
as possible. 

As for the slug name: if there is a duplicate, we name it as above. 

parent.slug.xml 

so if we looked for 

/de/blogs/blogentry1

which corresponds to blogs.blogentry1_de.xml 

/blogs/blogentry1

corresponds to the default language version of the file, blogs.blogentry1.xml 

If the page is moved up a level or between parents, we will need code that 
changes the slug name. 

So we can check the first param against an array of ISO 639 language codes and 
traverse backwards through the rest, checking the parent each time. 

If the parent does not correspond, we divert to a 404 page. 
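
The lookup described above might look roughly like this in Python. Everything here is an assumption made for illustration: the `LANGUAGES` set, the `PARENTS` slug-to-parent map, and the `resolve` helper are hypothetical, and the default-language file is assumed to follow the same parent.slug.xml pattern without a language suffix.

```python
# ASSUMPTION: a small subset of ISO 639-1 codes in use on the site.
LANGUAGES = {"de", "en", "fr"}

# ASSUMPTION: slug -> parent slug (None for top-level pages).
PARENTS = {"blogs": None, "blogentry1": "blogs"}


def resolve(uri):
    """Return the XML file name for a URI, or None to signal a 404."""
    segments = [part for part in uri.split("/") if part]
    language = None
    # Check the first segment against the list of language codes.
    if segments and segments[0] in LANGUAGES:
        language = segments.pop(0)
    if not segments:
        return None
    # Traverse backwards: each segment's parent must be the segment before it.
    for index in range(len(segments) - 1, -1, -1):
        expected_parent = segments[index - 1] if index > 0 else None
        slug = segments[index]
        if slug not in PARENTS or PARENTS[slug] != expected_parent:
            return None
    slug = segments[-1]
    parent = PARENTS[slug]
    name = f"{parent}.{slug}" if parent else slug
    if language:
        name += f"_{language}"
    return name + ".xml"


print(resolve("/de/blogs/blogentry1"))  # blogs.blogentry1_de.xml
print(resolve("/blogs/blogentry1"))     # blogs.blogentry1.xml
print(resolve("/news/blogentry1"))      # None
```

One open question this sketch does not settle is a top-level page whose slug happens to equal a language code.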

Original comment by MichaelS...@gmail.com on 26 Mar 2012 at 10:47

GoogleCodeExporter commented 9 years ago
> So why don't we pass the whole URI to PHP and do our parsing there?
> Surely better to have GS working out of the box on as many host systems as possible.

I agree, as I said at the end of the original issue:

> Something to think about is whether we might want to opt for a more WordPress-like system and just pass everything along to the PHP.

I just tried to support the 'old way' in the .htaccess as well; this way 
we'll be catering for plugin authors already, even before we need to rewrite 
the whole URL parsing. It will also keep old plugins working that might depend 
on the $_GET['id'] variable.

RewriteRule ^((.+/)*(.+)/?)$ index.php?path=$1&id=$3 [QSA,L]

This would put the complete path, including a possible trailing slash and the 
value of $_GET['id'], into $_GET['path'], while still putting the last part of 
the request in $_GET['id'] as well. (Untested, I'm at school.) Would that be 
better?

Original comment by martijn.personal@gmail.com on 26 Mar 2012 at 12:14

GoogleCodeExporter commented 9 years ago
New regex again; the previous, untested one included the trailing slash in 
$_GET['id']:

RewriteRule ^((.+/)?(.+?)/?)$ index.php?path=$1&id=$3 [QSA,L]
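
The difference between the two rules can be checked with a short Python sketch. It assumes that Python's `re` handles greedy and lazy quantifiers the same way mod_rewrite's engine does for these patterns and that the leading slash is already stripped; `captures` is a hypothetical helper.

```python
import re

# Rule from the previous comment (untested there) and the new rule above.
PREVIOUS_RULE = re.compile(r"^((.+/)*(.+)/?)$")
NEW_RULE = re.compile(r"^((.+/)?(.+?)/?)$")


def captures(rule, request):
    """Return (path, id), i.e. groups 1 and 3, for a request path."""
    match = rule.match(request)
    return (match.group(1), match.group(3))


# Print the captures of both rules side by side for a few request paths.
for request in ("de/blogs/entry/", "de/blogs/entry", "page/"):
    print(request, captures(PREVIOUS_RULE, request), captures(NEW_RULE, request))
```

With a trailing slash, the greedy (.+) in the previous rule takes the slash with it, so the id comes out as "entry/". The lazy (.+?) in the new rule stops as early as it can and leaves the slash to /?, so the id is "entry" while the path still holds the complete request.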

Original comment by martijn.personal@gmail.com on 26 Mar 2012 at 4:53