nicolassmith / urlevaluator

URL evaluator for android - lengthens shortened URLs for correct handling in android
https://play.google.com/store/apps/details?id=com.github.nicolassmith.urlevaluator

What to do about redirection to mobile sites #29

Closed nicolassmith closed 10 years ago

nicolassmith commented 10 years ago

The problem is as follows:

  1. tinyurl (and potentially others) always has more than one redirect (first a 301 redirect, then a JavaScript redirect) before reaching the intended destination.
  2. So I made the MultipleRedirectEvaluatorTask class, so that we only return the URL once we have gone all the way down the rabbit hole.
  3. But some websites (yelp.com) have a redirect (of type 303) that points to the mobile site, which is not the real intended destination.
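
To make the problem concrete, here is a minimal sketch of a "follow everything" loop. The class `RedirectChain`, the `HopResolver` interface, and the `maxHops` cap are hypothetical names for illustration, not the actual MultipleRedirectEvaluatorTask code; the real task would do an HTTP request where this sketch calls `nextHop`.

```java
final class RedirectChain {
    /** Hypothetical single-hop lookup: returns the next URL, or null if there is no redirect. */
    interface HopResolver {
        String nextHop(String url);
    }

    /**
     * Follows redirects until the resolver reports none, or until maxHops
     * hops have been taken. Returns the last URL reached.
     */
    static String followAll(String startUrl, HopResolver resolver, int maxHops) {
        String current = startUrl;
        for (int hop = 0; hop < maxHops; hop++) {
            String next = resolver.nextHop(current);
            if (next == null) {
                return current; // no further redirect: this is the final destination
            }
            current = next;
        }
        return current; // hop limit reached; return what we have
    }
}
```

With a resolver that knows a chain such as shortener, intermediate page, yelp.com, mobile yelp site, an unconditional loop like this ends on the mobile site, which is exactly the unwanted result described in item 3.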

Possible solutions:

  1. Fake the user agent (what does the Android HttpConnection use?) to hide that we are a mobile device, thus preventing the mobile redirect. However, when I tried this, it broke t.co links (they responded with 200 OK and no redirect; I still don't understand why).
  2. Don't use MultipleRedirect at all; just let an intent get thrown at each step of the way. Slightly ugly and annoying, but possibly more robust.
  3. Only follow certain types of redirects (i.e. only 301, not 303). This is the current solution, but what happens if there is a redirector service that does use 303? It seems that some (youtu.be) use 302. Are we safe to redirect on everything except 303?
  4. In the case of tinyurl, we know that there are 2 redirects. Should we just keep track of how many redirects we expect, and only go to that depth before returning? This seems robust but requires more maintenance.
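
Options 3 and 4 could be sketched roughly as below. This is only an illustration: the class `RedirectPolicy`, its method names, and the default depth of 1 for unknown hosts are assumptions, and whether 302 belongs in the followed set is the open question from option 3. Note that the second tinyurl hop is a JavaScript redirect, so the status-code check would only apply to HTTP-level hops.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class RedirectPolicy {
    // Option 3: status codes we are willing to follow. 303 is left out
    // because yelp.com uses it for its mobile redirect. Including 302
    // (seen on youtu.be) is an assumption, not a settled decision.
    private static final Set<Integer> FOLLOWED_CODES =
            new HashSet<>(Arrays.asList(301, 302));

    // Option 4: how many redirects we expect per shortener host.
    private static final Map<String, Integer> EXPECTED_DEPTH = new HashMap<>();
    static {
        EXPECTED_DEPTH.put("tinyurl.com", 2); // 301, then a JavaScript redirect
    }

    // Assumed default for hosts not listed above.
    private static final int DEFAULT_DEPTH = 1;

    static boolean shouldFollow(int statusCode) {
        return FOLLOWED_CODES.contains(statusCode);
    }

    static int maxDepthFor(String shortenerHost) {
        Integer depth = EXPECTED_DEPTH.get(shortenerHost.toLowerCase(Locale.ROOT));
        return depth != null ? depth : DEFAULT_DEPTH;
    }
}
```

The maintenance cost mentioned in option 4 shows up as the `EXPECTED_DEPTH` table: every shortener with an unusual chain needs its own entry.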

nicolassmith commented 10 years ago

I am curious about @duetosymmetry's thoughts.

nicolassmith commented 10 years ago

The MultipleRedirectEvaluatorTask should just check that we actually want to handle a redirect URL before attempting to do so, using PackageManager:

http://stackoverflow.com/a/6529160/697611

That way, when we get the URL for www.yelp.com, we see that we don't handle that host and just return the URL, without sending it to another redirection task.
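
A rough sketch of that check, in plain Java so it can run outside Android. The hard-coded host set is a stand-in (an assumption) for the PackageManager lookup described in the linked answer, where the app would ask whether its own activity resolves an intent for the URL. The class name `HandledHostCheck` and the listed hosts are illustrative only.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class HandledHostCheck {
    // Stand-in for the PackageManager query: hosts this app is assumed
    // to register an intent filter for. The real list lives in the manifest.
    private static final Set<String> HANDLED_HOSTS =
            new HashSet<>(Arrays.asList("tinyurl.com", "t.co", "youtu.be"));

    /** True if the URL's host is one we would evaluate further. */
    static boolean isHandled(String url) {
        String host = URI.create(url).getHost();
        return host != null && HANDLED_HOSTS.contains(host.toLowerCase(Locale.ROOT));
    }

    /**
     * Decides what to do with a redirect target: returns true if another
     * redirection task should be started, false if the URL should be
     * returned to the caller as the final result.
     */
    static boolean shouldEvaluateFurther(String redirectTarget) {
        return isHandled(redirectTarget);
    }
}
```

With this in place, a redirect that lands on www.yelp.com is returned as-is, so the 303 to the mobile site is never followed, and no per-status-code rule is needed.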