99designs / phumbor

A minimal PHP helper for Thumbor
MIT License
123 stars 43 forks source link

Url `rawurlencode` the original url when it shouldn't #10

Closed masom closed 10 years ago

masom commented 10 years ago

The Url class automatically encodes the original url when it chose ideally leave that step to the caller.

Example from the Thumbor docs: https://github.com/globocom/thumbor/wiki/Getting-started

http://localhost:8888/unsafe/300x200/http://www.waterfalls.hamilton.ca/images/Waterfall_Collage_home_sm1.jpg

The url itself isn't encoded.

What was the rationale to rawurlencode the original file path?

harto commented 10 years ago

I found that unencoded URLs were rejected by the Thumbor service as 404s. Encoding the URL fixed the issue for me. Is encoding the URL causing issues for you?

masom commented 10 years ago

We are facing issues with double encoding of the urls from services such as Google, Facebook and Pinterest.

harto commented 10 years ago

The URL signature changes according to whether the URL is escaped. Consider the following:

Escaped:

/DRkNWh9gx1LznxGesn85hhs9Dmc=/fit-in/500x370/http%3A%2F%2Fd8u1nmttd4enu.cloudfront.net%2Fdesigns%2F32962772~09d9cf6543ea5ba5f87fec66a7d140971c47048d-original

Unescaped:

/FSn9dN-hVNxPs00m_nwkW88tCWM=/fit-in/500x370/http://d8u1nmttd4enu.cloudfront.net/designs/32962772~09d9cf6543ea5ba5f87fec66a7d140971c47048d-original

I'm not sure how we could defer the option of encoding to the caller. Can you give me an example of the double-encoding?

masom commented 10 years ago

The Facebook image proxy displays a blank image: https://fbexternal-a.akamaihd.net/app_full_proxy.php?app=787809261232980&v=1&size=z&cksum=c3cdc744176bbab43ab59c093c77729c&src=http%3A%2F%2Fd28d6vqymyqaiq.cloudfront.net%2FJb86INDi2N3LXDecypZpsNxitSY%3D%2Ffit-in%2F256x256%2F%2524crated%2524%2F2014%2F01%2F15%2Fuploads%2F67ede8776963ee0d215808d09b83853f%2F960.jpg

Whereas the image url works: http://d28d6vqymyqaiq.cloudfront.net/Jb86INDi2N3LXDecypZpsNxitSY=/fit-in/256x256/%24crated%24%2F2014%2F01%2F15%2Fuploads%2F67ede8776963ee0d215808d09b83853f%2F960.jpg

harto commented 10 years ago

I don't see how this can be fixed in this client library. As far as I can tell, the original URL must be encoded, or Thumbor won't be able to find the image. I think it needs fixing upstream in Thumbor.

harto commented 10 years ago

OK -- I've investigated this a little further.

The example URL I posted above seems problematic somehow. If I don't sign the URL, everything works as expected, e.g.

/unsafe/500x370/http://d8u1nmttd4enu.cloudfront.net/designs/32962772~09d9cf6543ea5ba5f87fec66a7d140971c47048d-original

At 99designs our image URLs normally contain a tilde character (~). I think this is the reason our URLs don't work unescaped. For example, here's another problematic URL:

/7_YM1dA7tbWk0d7p7zOhS3n6xXQ=/320x240/http://dev.harto.org/~hello/mugshot.png
harto commented 10 years ago

I should say I'm testing against Thumbor 3.10.2. Perhaps this isn't the behaviour in more recent Thumbors. I'm doing some more investigating.

harto commented 10 years ago

Yep, the tilde is the issue. There are two possible solutions (discussed here: https://github.com/thumbor/thumbor/issues/289). It looks like you're correct, we shouldn't be encoding urls in this library.

masom commented 10 years ago

@harto thanks for the thorough review and handling of this bug.

We also noticed thumbor marking valid urls as "malformed".

harto commented 10 years ago

Latest Thumbor can handle tildes per https://github.com/thumbor/thumbor/commit/1693d682e26716fa1fb552cc3e24e29ae4b80e4f. I think I'll drop URL encoding from Phumbor, and leave it to the caller per your suggestion.

estahn commented 10 years ago

:+1:

We had an issue with @ as a resolution identifier when i removed rawurlencode. But this shouldn't be a show stopper.

harto commented 10 years ago

Sorry for the delay on this. I've removed the rawurlencode call. As this is a potentially breaking change, I bumped the major version number from 0 -> 1.

Note: if you update to this version, your thumbnail URLs will change. If you use the URL as a cache key (e.g. in a CDN), be aware that this will invalidate your caches.

estahn commented 10 years ago

Awesome Stuart, thanks.

masom commented 10 years ago

Thanks guys!