jakartaee / rest

Jakarta RESTful Web Services
Other
364 stars 121 forks source link

JAX-RS treats URLs with and without a trailing slash as equivalent #468

Open glassfishrobot opened 10 years ago

glassfishrobot commented 10 years ago

In section 3.7.3 "Converting URI Templates to Regular Expressions" JAX-RS appends "(/.*)?" to the resulting regex, thus treating URLs with or without a trailing slash as equivalent.

This is a mistake. According to RFC 1808, section 4, step 6, the two URLs are not equivalent. As a result, relative paths resolve to a different URL depending on whether the URL contained a trailing slash or not. It is important to differentiate between the two and only match the address the user asks for. See http://stackoverflow.com/a/4765962/14731 for more information.

Ideally, users should be able to specify multiple @Path annotations (or one with multiple values) so that users who want to match URLs with or without a trailing slash can do so. Alternatively, allow users to specify a non-capturing regex so that UriBuilder doesn't treat it as a template parameter needing a value.

Affected Versions

[2.0]

glassfishrobot commented 6 years ago
glassfishrobot commented 10 years ago

@glassfishrobot Commented Reported by cowwoc

glassfishrobot commented 10 years ago

@glassfishrobot Commented cowwoc said: To clarify, JAX-RS is matching a resource for a URL without a trailing slash (against my wishes). The resource returns an HTML response entity. The HTML, CSS, JS code then resolve paths relative to this incorrect URL and result in broken links.

We can certainly come up with server-side workarounds, but the fundamental fact remains that the two URLs are not equivalent and users should be able to configure a resource to only match a URL that ends with a slash, failing to match otherwise. Furthermore, failing to match and matching and then returning HTTP 404 is not the same thing because in the latter case a subsequent servlet filter may decide to match a URL that Jersey did not consume.

glassfishrobot commented 10 years ago

@glassfishrobot Commented cowwoc said: One possible solution is to borrow the idea of regex non-capturing groups. (?:regex) matches a regular expression without defining a new group. Similarly, JAX-RS could allow {?:regex} to match a regular expression without defining a new template parameter. Users could then define precisely the regex that should be used, and JAX-RS would guarantee that the regex would not be modified (as is currently the case by appending (/.*)? to the end).

glassfishrobot commented 7 years ago

@glassfishrobot Commented This issue was imported from java.net JIRA JAX_RS_SPEC-463

cowwoc commented 6 years ago

My understanding is that resolving .. relative to http://www.example.com/a/b should return http://www.example.com/ whereas resolving .. relative to http://www.example.com/a/b/ should return http://www.example.com/a/.

Any web browsers running on top of JAX-RS servers will have problems using relative URIs given the current specification.

mkarg commented 6 years ago

@cowwon I doubt your latter assumption as it would be impossible to navigate to /a from /a/b.

cowwoc commented 6 years ago

@mkarg If I understand correctly, navigation relative to /a/b is actually the same as navigating relative to /a/. Meaning, if you are on /a/test.html then the relative path b/test.html will redirect you to /a/b/test.html.

By contrast, if you are on /a/test.html/ then the relative path b/test.html will redirect you to /a/test.html/b/test.html. I hope that makes sense.

cowwoc commented 5 years ago

I am attaching a testcase that proves the existence of this bug.

testcase.zip

  1. Unpack the testcase
  2. Run mvn install
  3. Run java -jar target/java -jar target\testcase-1.0-SNAPSHOT-jar-with-dependencies.jar
  4. Open a browser to http://localhost:8080/a/b and click on the "../" link
  5. Notice that the browser navigates from /a/b to /
  6. Open a browser to http://localhost:8080/a/b/ and click on the "../" link
  7. Notice that the browser navigates from /a/b to /a/

This proves that URIs that end with a slash are not equal to URIs that without the terminal slash. Thus, @Path annotations should treat these URIs differently and not automatically add or remote slashes on the user's behalf.

I await your feedback.

mkarg commented 5 years ago

Gili, awesome! Unfortunately I am really busy the next two weeks. Please don't be disappointed if my answer needs a while... :-)

cowwoc commented 5 years ago

No worries. I'll set a reminder to bug you in 2 weeks :) Good luck with your work!

spericas commented 5 years ago

@cowwoc If you're suggesting that the spec should distinguished URLs and paths with and without trailing slashes, I don't think it would be possible. This normalization step was introduced in version 1.0 and it is in the fabric of the matching algorithm as I remember it (and in the way path values are converted into regexp).

cowwoc commented 5 years ago

@spericas I am suggesting exactly that. I understand that this design mistake is woven into the core of the matching algorithm but you can't ignore the fact that URI handling is fundamental to REST APIs and the spec gets it wrong.

Just fix this in the next major version.

spericas commented 5 years ago

@spericas I am suggesting exactly that. I understand that this design mistake is woven into the core of the matching algorithm but you can't ignore the fact that URI handling is fundamental to REST APIs and the spec gets it wrong.

Just fix this in the next major version.

I think it was the correct decision at the time, the alternative introduces a lot of gotchas too when developing REST services. It is also not a backward compatible change and has the potential of breaking tons of existing applications.

cowwoc commented 5 years ago

@spericas Those "gotchas" are how URIs work :) Seriously... you are implementing the interface (specification) of a URI and your implementation breaks the contract. I've never disputed that this breaks backwards compatibility for users of JAX-RS but you should know that anyone who interfaces with JAX-RS servers from the outside will run into problems. JS clients transform URIs coming in/out of the server and you're breaking their compatibility.

As an analogous example: A few years ago I had to use an RS232 library to communicate with serial ports. They implemented InputStream but decided to return -1 if no bytes were available (yet!). This worked great, so long as you didn't mix their library with any other code. The minute you passed their InputStream implementation into any 3rd-party library, the -1 value was interpreted as end of stream which, according to the InputStream specification, is precisely what it means. This led to a whole slew of ugly bugs.

So yes, JAX-RS works perfectly in its current state with other JAX-RS technologies. And yes, changing this behavior will break JAX-RS servers. But your specification is already broken. Anyone mixing JAX-RS servers with non-JAX-RS HTTP clients will run into bugs. And there are a whole lot more non-JAX-RS clients out there than JAX-RS servers.

spericas commented 5 years ago

@cowwoc First, I don't think the situation is a bad as you describe it. People have been using JAX-RS with JS clients for years (certainly I have) and these technologies have been able to co-exist. Second, I'm not sure what you mean by "your specification", it is certainly not mine, and in fact I didn't even participate in the first version of it. Third, I find this discussion really academic at this point, I can't imagine the Jakarta community ever supporting a non-compatible change like the one you're describing, at least not for the 2.X work. Some non-compatible changes are planned for 3.0, perhaps this can be revisited then --but even then I'd be skeptical of such a change getting through.

cowwoc commented 5 years ago

Let's keep the door open for 3.0 then.

mkarg commented 5 years ago

As 3.0 definitively will break any existing application anyways, I think it is worth continuing the discussion.

bertysentry commented 5 years ago

+1 Got the same problem here. Client may request a URI with or without a slash, and it won't mean the same thing. My JAX-RS backend treats it the same way. It is annoying. But I guess we'll need to cope with that...

kaefer3000 commented 5 years ago

+1

I work a lot with RDF, where URI equality is crucial and determined using String equality. In case of relative URIs in responses that come from RESTful APIs, the URI of the resource determines the URI to resolve against, and is thus important to be "deterministic". If in JAX-RS, they may or may not come with a slash, this makes it hard to work with the API. I wrote a number of classes to deal with the situation, which add or remove slashes where I do (or do not) like trailing slashes, see here.

igor-kim commented 4 years ago

+1

In situation when client might request either "/api/somepath" and "/api/somepath/", which end up being processed by the same code with annotation @Path("/api/somepath"), this code needs to do some workarounds to distinguish between the calling URLs if relative URLs should be returned in response. One of the workarounds might be to inject @Context UriInfo uriInfo and check the request path: uriInfo.getRequestUri().getPath().endsWith("/") Which surely do the work, but looks ugly and confusing as @Path annotation states only one of the URL options (either with or without trailing slash). Adding another method with annotation for the second path option @Path("/api/somepath/") does not work, as only one of the methods becomes handler of the both paths "/api/somepath" and "/api/somepath/". I understand complications of breacking backward compatibility, but at least allow for routing of such paths to different methods by server configuration parameter or annotation option.

pdinc-oss commented 4 years ago

another patch we are going to have to make... sigh!

Link to internal bug: https://projects.pdinc.us/show_bug.cgi?id=2217

spericas commented 4 years ago

@pdinc-oss Links to internal issues that are not accessible to everyone here should NOT be included.

cowwoc commented 3 years ago

Unfortunately version 3.0 went out the door without this. Can we please schedule this against a concrete milestone?

spericas commented 3 years ago

@cowwoc Version 3.0 mentioned in this discussion is now 4.0. We had to change the major number for Jakarta EE 9.

cowwoc commented 3 years ago

Okay, thanks for the clarification. How can we move this issue forward? Does any design discussion need to happen?

chkal commented 3 years ago

I don't think that there is consensus yet if the current behavior should be changed or not.

Speaking only for myself: I also dislike that trailing slashes are currently ignored. However, I also think that changing this could cause more problems than it solves.

Example from my personal experience: Last week I reviewed some code written by a colleague. He basically defined a resource method like this @Path("/foo/bar"), but in the JavaScript frontend he sent requests to /foo/bar/. This worked fine, and he didn't even notice the inconsistent URLs. I noticed this, and we fixed this issue, but it is very easy to miss something like that. And if a newer spec version would change the behavior, it would cause major trouble in cases like this.

mkarg commented 3 years ago

To extends Christian's answers: I personally never differentiated URLs with-or-withot slash, mostly by mistake, was always was happy that "it just worked". But even if fixing this fault would mean that my existing applications will definitively break, I am definitively +1 for fixing it. In the end, it is my fault that I did it wrong in my app, so it is my job to fix it ASAP. So big +1 for fixing it.

mkarg commented 3 years ago

@cowwoc Would you like to lead this effort and provide a compulsory PR (including spec & JavaDoc changes)?

chkal commented 3 years ago

In the end, it is my fault that I did it wrong in my app, so it is my job to fix it ASAP.

I'm not sure if it is really a fault of the developer. Actually the spec allows to do it this way, and it will work fine with compliant implementations. The problem is that 95% of the developers don't actually know that trailing slashes are ignored by the matching algorithm and simply do it by mistake.

bertysentry commented 3 years ago

Couldn't the new (proper) behavior just be an option to enable?

cowwoc commented 3 years ago

@chkal

This worked fine, and he didn't even notice the inconsistent URLs. I noticed this, and we fixed this issue, but it is very easy to miss something like that. And if a newer spec version would change the behavior, it would cause major trouble in cases like this.

Whether JAX-RS treats this differently or not, the URI specification and all web browsers already do. Any relative URIs you construct will break. The fact that some people haven't run into this problem yet is a matter of pure luck.

@bertysentry

Couldn't the new (proper) behavior just be an option to enable?

If we introduce an option, I'd flip the default: introduce this change in a major version change, break backwards compatibility, and provide users with an option (off by default) that would restore the old behavior when enabled.

@mkarg Honestly, I'd prefer not to lead this effort directly. I care deeply about this change but I am already championing too many causes in my personal life and this would be one too many. I only have time to review proposals from time to time.

spericas commented 3 years ago

I second @chkal's experience as well. I suspect that is the reason why the decision was made in the first version. Unfortunately, I believe this change will create way more angry users than happy ones when their apps stop working.

mkarg commented 3 years ago

Hm... with 3.0 the apps stopped worked anyways.

spericas commented 3 years ago

Hm... with 3.0 the apps stopped worked anyways.

You mean with package renaming? That is easily automated via tooling, unlike what is being discussed here. If you mean the future 4.0, then I hope not all apps will be broken; moreover, what is being described here affects clients as well.

cowwoc commented 3 years ago

@spericas As I keep on mentioning, the house is already on fire. image. All apps are already broken when users attempt to use relative URIs in any web browser.

That said, one thing we can do is log a warning every time a client hits an endpoint with a different trailing slash than the server. Have the message explain that this behavior is broken, deprecated, and will be changing in version 4.x.

spericas commented 3 years ago

@cowwoc Respect your opinion, but don't share it (despite the funny pics!). It's definitely and interesting discussion, but hardly a showstopper.

Having said that, It would be interesting and an easier path for some implementations to try this out first.

mkarg commented 3 years ago

@jansupol @ronsigal @andymc12 @deki Are existing products already having a property to switch on this new behavior or would you be interested in adding one? In that case it would make sense to add a JAX-RS property to have a product-independent switch.

jansupol commented 3 years ago

I am +1 for having the option in Jersey, but it may be a significant change in the matching algorithm. We might find some cycles for it after Christmas, depending on other 3.1 requirements.

namedgraph commented 2 years ago

@jansupol any news on this? The current behavior doesn't make sense in terms of URI specifications and everything that builds on them. Looking up triples in an RDF database by the HTTP request URI (this is how Linked Data works) will return two completely different resources, given that it contains URIs with and without the trailing slash.

I wonder who thought this was a good idea in the first place... 🤦‍♂️

gregid commented 1 year ago

Has there been any progress on the trailing slash issue? For anyone working with RDF/LinkedData it is a huge deal and trying to downplay it is mind boggling.

cowwoc commented 1 year ago

This issue is almost 10 years old, and is the most up-voted and commented issue in this project... https://github.com/jakartaee/rest/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc

@mkarg What is needed to move this forward?

namedgraph commented 1 year ago

@jansupol ping

mkarg commented 1 year ago

@mkarg What is needed to move this forward?

The will of all committers to actively break backwards compatibility.

gregid commented 1 year ago

The other way of looking at it is a chance to finally remove technical debt (lack of alignment to standards).

cowwoc commented 1 year ago

@mkarg To clarify, we are asking to deprecate the current behavior and expose an optional new behavior that respects the URI standard. Then sometime in some future major revision, we delete the old behavior and make the new behavior the default.

namedgraph commented 1 year ago

Still no news on this?