propublica / upton

A batteries-included framework for easy web-scraping. Just add CSS! (Or do more.)
MIT License

relative URLs #8

Closed jeremybmerrill closed 11 years ago

jeremybmerrill commented 11 years ago

Issue reported by @danhillreports:

Relative URLs aren't handled properly. If an anchor's href attribute holds a relative URL, e.g. "/index.php", Upton will try to fetch "/index.php" as-is. That can't work, because the request has no scheme or host.

The fix is easy: detect relative URLs and, if one is found, prepend the scheme and hostname of the page it was scraped from.
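
A minimal sketch of that idea, assuming Ruby's standard `uri` library. The helper name `resolve_url` and the example.com URLs are hypothetical, chosen for illustration; this is not Upton's actual API, nor necessarily the code that ended up in the fix.

```ruby
require 'uri'

# Hypothetical helper (not part of Upton): resolve an href found on a page
# against the URL of that page. URI.join leaves absolute hrefs untouched
# and resolves relative ones against the base URL.
def resolve_url(href, page_url)
  URI.join(page_url, href).to_s
end

page = "http://example.com/articles/list.html" # assumed index page URL

resolve_url("/index.php", page)          # => "http://example.com/index.php"
resolve_url("page2.html", page)          # => "http://example.com/articles/page2.html"
resolve_url("http://other.org/a", page)  # => "http://other.org/a"
```

Resolving against the full page URL rather than prepending only the hostname also covers hrefs such as "page2.html" that are relative to the current directory.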

dannguyen commented 11 years ago

I've addressed this in a pull request, but I think that ultimately some refactoring of the API will have to be done: https://github.com/propublica/upton/pull/14

jeremybmerrill commented 11 years ago

Fixed in 0.2.7 on rubygems.org with @dannguyen's fix in #14.