Decoding functionality for Escaper

GoogleCodeExporter commented 9 years ago

Please add functionality that reverses the effects of Escaper.escape(String).

Use-case: When converting from URL to URI, we need to decode the URL components 
before passing them into the URI constructor to be re-encoded. See 
http://stackoverflow.com/a/22279061/14731 for an example.

Original issue reported on code.google.com by cow...@bbs.darktech.org on 9 Mar 2014 at 6:20

GoogleCodeExporter commented 9 years ago

Is the use case served by the JDK URLDecoder method used in the StackOverflow 
answer?

Original comment by cpov...@google.com on 11 Mar 2014 at 7:11

GoogleCodeExporter commented 9 years ago

I have a feeling I know what's going to happen once I answer this... :)

Yes, the use-case is served by the JDK URLDecoder method. I don't have an 
immediate use-case for decoding the other Escaper implementations but perhaps 
others do...?

Original comment by cow...@bbs.darktech.org on 11 Mar 2014 at 10:24

GoogleCodeExporter commented 9 years ago

> I have a feeling I know what's going to happen once I answer this... :)

:)

> Yes, the use-case is served by the JDK URLDecoder method. I don't have an
> immediate use-case for decoding the other Escaper implementations but
> perhaps others do...?

Our general thinking here is that a proper URL class ought to handle unescaping 
as part of parsing, similar to what parsers do for HTML, etc. Such a URL class 
has been on our to-do list forever. See issue 1005.

Original comment by cpov...@google.com on 12 Mar 2014 at 4:02

Added labels: Package-Escape, Type-Addition

GoogleCodeExporter commented 9 years ago

Jersey's UriBuilder does an excellent job in that department, but there are 
still a few use-cases it fails to handle nicely.

Imitation is the highest form of flattery :) Have you considered "extending" 
their UriBuilder design?

Original comment by cow...@bbs.darktech.org on 12 Mar 2014 at 11:47

GoogleCodeExporter commented 9 years ago

Would that be this one? 
https://jersey.java.net/apidocs/2.0/jersey/javax/ws/rs/core/UriBuilder.html

(In this case, we do have one internally. The problem is that it has a flaw or 
two. We'd like to fix it up rather than introduce a new one, but the impact on 
existing users (who might rely on, e.g., the exact hash generated by a 
UriBuilder usage) makes the fix difficult.)

Original comment by cpov...@google.com on 13 Mar 2014 at 3:08

GoogleCodeExporter commented 9 years ago

Yes, that's the one.

"We do have one internally": What do you mean internally? Internally at Google?

"who might rely on, e.g., the exact hash generated by a UriBuilder usage": 
since when does the specification guarantee the exact hash generated? Isn't it 
a code smell if users need to depend on this kind of (implementation-specific) 
information?

Original comment by cow...@bbs.darktech.org on 13 Mar 2014 at 4:15

GoogleCodeExporter commented 9 years ago

Yeah, sorry, internally at Google.

And the hash they're relying on isn't hashCode(); it's the output of some well 
defined hashing algorithm that they run against the URL string. Consequently, 
if we change the URL string by improving our class's escaping behavior, the 
hash will change, and their code will break.

Original comment by cpov...@google.com on 13 Mar 2014 at 1:00

GoogleCodeExporter commented 9 years ago

Couldn't you release the fixed version to the public, and apply a 
"normalization" process inside Google before passing it on to the hashing 
algorithm?

Meaning, I would expect you to retain backwards compatibility by modifying the 
wrapper code (the code that invokes the UriBuilder and passes the result to a 
hashing algorithm), as opposed to holding UriBuilder's implementation back.

Are you saying there isn't a deterministic way to convert from the "fixed" 
format back to the format expected by the hashing function?

Original comment by cow...@bbs.darktech.org on 13 Mar 2014 at 6:08

GoogleCodeExporter commented 9 years ago

Something like that should work. The challenge is in identifying all the lines 
of Google code that depend on the exact URL. This can be more difficult than 
I've made it out to be: Some callers might build a URL and send it via RPC to 
an entirely different service, and only in that different service is the URL 
hashed.

Original comment by cpov...@google.com on 13 Mar 2014 at 6:14

GoogleCodeExporter commented 9 years ago

If you normalize it as the source before sending it out via RPC you should be 
safe. Meaning: Find Usages on your UriBuilder class and always normalize the 
result regardless of where it goes. You could always optimize this later by 
removing normalization where it's not necessary.

Original comment by cow...@bbs.darktech.org on 13 Mar 2014 at 6:29

GoogleCodeExporter commented 9 years ago

We'd be talking a 5-digit number of callers here. I'd additionally worry that 
new users within an app might omit the normalization call. We'd almost need to 
force users to choose which escaping policy they want.

I'm beginning to wonder if we should just release a UriBuilder that has the 
correct behavior in Guava and leave the one inside Google with the old behavior.

Original comment by cpov...@google.com on 13 Mar 2014 at 6:33

GoogleCodeExporter commented 9 years ago

How about having the Google UriBuilder wrap/extend the Guava UriBuilder 
implementation? This way your code keeps on running without changing any 
references, and Guava users still benefit.

Original comment by cow...@bbs.darktech.org on 13 Mar 2014 at 7:36

GoogleCodeExporter commented 9 years ago

That could work.

Original comment by cpov...@google.com on 13 Mar 2014 at 7:44

GoogleCodeExporter commented 9 years ago

This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:09

Added labels: MigratedToGitHub

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:17

Changed state: Migrated

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:07

Added labels: Restrict-AddIssueComment-Commit

DaveAKing / guava-libraries

Decoding functionality for Escaper #1691