postmodern / spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
MIT License

Respect base tags #58

Open ericmason opened 7 years ago

ericmason commented 7 years ago

Currently, <base href="..."> tags are not taken into account, so on pages that declare a base tag the spider resolves relative links against the page URL and ends up at the wrong URL. With this patch, the spider correctly calculates absolute URLs when a base tag is present.
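To illustrate the difference, here is a minimal sketch of how a base href changes link resolution, using only Ruby's standard URI library. The helper name resolve_link and the example URLs are hypothetical and chosen for illustration; this is not the code in the patch, nor part of Spidr's API.

```ruby
require 'uri'

# Hypothetical helper (not Spidr's API): resolve a link found on a page,
# honoring an optional <base href="..."> value taken from that page.
def resolve_link(page_url, link, base_href = nil)
  base = page_url
  # A base href may itself be relative, so resolve it against the page URL first.
  base = URI.join(page_url, base_href).to_s if base_href && !base_href.empty?
  URI.join(base, link).to_s
end

page_url = 'http://example.com/articles/2017/index.html'

# No base tag: the link resolves against the page URL.
puts resolve_link(page_url, 'photo.jpg')
# => http://example.com/articles/2017/photo.jpg

# Absolute base href: the link resolves against the base instead.
puts resolve_link(page_url, 'photo.jpg', 'http://cdn.example.com/assets/')
# => http://cdn.example.com/assets/photo.jpg

# Relative base href: resolved against the page URL, then used as the base.
puts resolve_link(page_url, 'photo.jpg', '/static/')
# => http://example.com/static/photo.jpg
```

Ignoring the base tag corresponds to the first call for every page, which is why links on pages with a base tag lead to the wrong URL.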