Webdevdata / fetcher

Tool to download website data.
The Unlicense

On error, try https and/or prepending "www." to url; connect to urlhost, not url #3

Closed by nwtn 11 years ago

nwtn commented 11 years ago

In a test I ran on the first 2000 sites, these modifications reduced the number of errors from 151 (with the original script) to 80.
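For readers skimming the thread, here is a rough sketch of the retry idea from the PR title: when a fetch fails, try the same URL over https and/or with "www." prepended to the host. This is not the code from the PR. The function name `fallback_urls` and the order of the candidates are assumptions made for illustration, and the "connect to urlhost, not url" part of the title is not covered.

```python
from urllib.parse import urlsplit, urlunsplit


def fallback_urls(url):
    """Return alternative URLs to try after a request to `url` fails.

    Hypothetical helper, not part of the fetcher: it only builds the
    candidate list and performs no network access.
    """
    parts = urlsplit(url)

    # Keep the original scheme first; add https only when starting from http.
    schemes = [parts.scheme]
    if parts.scheme == "http":
        schemes.append("https")

    # Keep the original host first; add a "www." variant if it lacks one.
    hosts = [parts.netloc]
    if not parts.netloc.startswith("www."):
        hosts.append("www." + parts.netloc)

    candidates = []
    for scheme in schemes:
        for host in hosts:
            candidate = urlunsplit(
                (scheme, host, parts.path, parts.query, parts.fragment)
            )
            # Skip the URL that already failed.
            if candidate != url:
                candidates.append(candidate)
    return candidates
```

A caller would request the original URL first and walk this list only on error, stopping at the first candidate that responds.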

nwtn commented 11 years ago

Ah shoot, I accidentally included the other PR for issue #1 in here. Sorry. Feel free to merge this and close the other PR. Or, if you don't want that fix, let me know and I'll remove it from here.

yoavweiss commented 11 years ago

Sounds great!