yuankai / urlrewritefilter

Automatically exported from code.google.com/p/urlrewritefilter

Rule matches URL unexpectedly #148

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
========================================================================
Consider the following rule (copied verbatim from our application):

<rule>
    <from>^/author/([\d\w]+)(\?.*)?$</from>
    <to last="true">/author?id=$1</to>
</rule>

I would expect /author/1 and /author/1?q=expected to be translated into 
/author?id=1.

I do not expect /author/1/article/5 to match this rule, because the id segment 
is followed by /, which the pattern accepts neither as part of the id nor in 
place of the optional ? suffix.

What is the expected output? What do you see instead?
========================================================================
Expected: URLs such as /author/{id} match the rule, and URLs such as 
/author/{id}/<something else>/<yet something else> do not.

Actual: /author/{id}/<something else>/<yet something else> also matches the 
rule and we get a malformed output URL.

What version of the product are you using? On what operating system?
========================================================================
Version: 4.0.4
OS: Windows 7 Professional 64-bit, RedHat Linux 2.6 64-bit, Ubuntu 12.10 32-bit

Original issue reported on code.google.com by manish.i...@gmail.com on 12 Sep 2013 at 4:31

GoogleCodeExporter commented 9 years ago
Also consider the following Java code:

final Pattern authorPattern = Pattern.compile("^/author/([\\d\\w]+)(\\?.*)?$");

System.out.println(authorPattern.matcher("/author/1").matches());
System.out.println(authorPattern.matcher("/author/1?q=expected").matches());
System.out.println(authorPattern.matcher("/author/1/article/5").matches());

These sysout statements correctly print true, true and false respectively, as 
expected.  This is the same behaviour I would expect from URLRewriteFilter as 
well since it uses the Pattern class internally.

Original comment by manish.i...@gmail.com on 12 Sep 2013 at 4:36

GoogleCodeExporter commented 9 years ago
I have found something intriguing.  The full configuration file is given below:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE urlrewrite PUBLIC "-//tuckey.org//DTD UrlRewrite 4.0//EN" 
"http://www.tuckey.org/res/dtds/urlrewrite4.0.dtd">
<urlrewrite use-query-string="true">
    <rule>
        <from>^/author/([\d\w]+)(\?.*)?$</from>
        <to last="true">/author?id=$1</to>
    </rule>
    <rule>
        <from>^/author/([\d\w]+)/article/([\d\w]+)(\?.*)?$</from>
        <to last="true">/author/article?id=$1&amp;ac=$2</to>
    </rule>
</urlrewrite>

Trying to debug through the class RuleChain, I see that the URL 
/author/1/article/9 does not match the first rule but matches the second rule.  
At this point I guess a forward must be performed to /author/article?id=1&ac=9 
since the default rule target behaviour is forward.

In the debugger I see that RuleChain is now invoked a second time, this time 
with the URL /author/article?id=1&ac=9 and then matches the first rule.  Here 
it gets converted into /author?id=article.

Is this behaviour expected?  Why is the rewritten URL sent through the chain 
again?
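
The second pass described above can be reproduced outside the filter. The 
following is only a sketch: the class and method names are made up for 
illustration, and it uses java.util.regex directly rather than any 
UrlRewriteFilter API. It assumes the forwarded URL, including its query string 
(because of use-query-string="true"), is matched against the first rule again, 
which could happen if, for example, the filter is also mapped for FORWARD 
dispatches in web.xml.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of UrlRewriteFilter.
class SecondPassDemo {
    // Same expression as the first <from> in the configuration above.
    static final Pattern AUTHOR = Pattern.compile("^/author/([\\d\\w]+)(\\?.*)?$");

    // Applies the first rule to a URL (path plus query string).
    // Returns the rewritten URL, or null when the rule does not match.
    static String applyAuthorRule(String url) {
        Matcher m = AUTHOR.matcher(url);
        return m.find() ? "/author?id=" + m.group(1) : null;
    }

    public static void main(String[] args) {
        // First pass: the original request does not match the first rule.
        System.out.println(applyAuthorRule("/author/1/article/9"));       // null
        // Second pass: the forwarded URL does match, with $1 = "article".
        System.out.println(applyAuthorRule("/author/article?id=1&ac=9")); // /author?id=article
    }
}
```

If that assumption holds, the first rule itself behaves as written; the 
unexpected match comes from the rewritten target /author/article?id=... having 
the same shape as /author/{id}?... once the query string is part of the matched 
URL.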

Original comment by manish.i...@gmail.com on 12 Sep 2013 at 5:03

GoogleCodeExporter commented 9 years ago
I am experiencing a similar problem. It seems like the same rule is processed 
again and again. The rule is used to match a country/language prefix in the URL 
(e.g. /gb/en/) and I'm expecting the (.*) part to be sent forward in the filter 
chain.

<rule>
    <from>^/([a-z]{2})/([a-z]{2})/(.*)$</from>
    <!-- set some session attributes here -->
    <to>%{context-path}/$3</to>
</rule>

This seems to work fine, but what I don't understand is why /gb/en/gb/en/test 
ends up being just /test when the request reaches the rest of the filter chain. 
If I change the (.*) to (.+) the logic seems to work correctly, but it doesn't 
really work the way I would assume.
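
To illustrate how /gb/en/gb/en/test could end up as /test, here is a small 
sketch that applies the rule above repeatedly until the URL stops changing. It 
is a simulation with java.util.regex only. The class and method names are 
hypothetical, %{context-path} is assumed to be empty, and the repeated 
application is an assumption based on the debugger observation earlier in this 
thread, not confirmed filter behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical simulation, not part of UrlRewriteFilter.
class PrefixStripDemo {
    // Same expression as the <from> in the rule above.
    static final Pattern PREFIX = Pattern.compile("^/([a-z]{2})/([a-z]{2})/(.*)$");

    // One pass of the rule, with an empty context path assumed.
    // Returns the input unchanged when the rule does not match.
    static String applyOnce(String url) {
        Matcher m = PREFIX.matcher(url);
        return m.find() ? "/" + m.group(3) : url;
    }

    // Re-applies the rule until the URL no longer changes.
    // Each successful pass shortens the URL, so the loop terminates.
    static String applyUntilStable(String url) {
        String current = url;
        String next = applyOnce(current);
        while (!next.equals(current)) {
            current = next;
            next = applyOnce(current);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(applyOnce("/gb/en/gb/en/test"));        // /gb/en/test
        System.out.println(applyUntilStable("/gb/en/gb/en/test")); // /test
    }
}
```

Under these assumptions a single pass yields /gb/en/test, and a further pass 
over that result strips the second prefix as well, which matches the /test 
seen further down the filter chain.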

It's been a while since this was posted, so manish, have you found any 
explanation for why this happens?

Original comment by ristomat...@gmail.com on 28 Oct 2013 at 11:06

GoogleCodeExporter commented 9 years ago
I haven't looked through the source code in detail to get to the bottom of 
this, so I do not have a definite answer as of now.  We have stopped using the 
filter due to this problem as it has broken practically all of our 
functionality (although I would have liked to continue using it).

I am hoping to spend some time on this problem in the second half of November 
when my project commitments will reduce a little bit.  Will post my findings 
here if I come across anything useful.

Original comment by manish.i...@gmail.com on 30 Oct 2013 at 7:41

GoogleCodeExporter commented 9 years ago
We will probably have to keep using the filter, as our project deadlines are 
closing in and this issue came up quite suddenly. It's too bad that 
development of this filter seems to be discontinued, as it mostly works great 
and I don't know of any alternative that works in the filter chain, as opposed 
to Apache with mod_rewrite, for example.

Fortunately this is not a deal breaker for our usage scenario but I would still 
be very interested to hear your findings if you end up pursuing the problem 
further. I personally most likely won't have time to find the root cause of 
this problem as we can most likely circumvent it. In any case, thank you for 
your time to comment on this issue!

Original comment by ristomat...@gmail.com on 30 Oct 2013 at 8:44