Open Anonyfox opened 9 years ago
When scraping the website http://www.bostonherald.com, the "http:" part gets stripped out. The site has
<link rel="canonical" href="//www.bostonherald.com/" /> in its HTML, so the protocol-relative "//" prefix should be resolved to "http:" by default.
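
To illustrate the behaviour being requested, here is a minimal Python sketch and not the library's actual code. The function name `resolve_canonical` and its parameters are hypothetical. It assumes the scraper knows the URL of the page it fetched. If it does, the scheme is taken from that URL; otherwise it falls back to "http", as suggested above.

```python
from urllib.parse import urljoin, urlsplit


def resolve_canonical(href, page_url=None, default_scheme="http"):
    """Turn a protocol-relative canonical href ("//host/path") into an absolute URL.

    Hypothetical helper for illustration only; names are not from the library.
    """
    if not href.startswith("//"):
        # Already absolute (or a plain relative path): leave it to the caller.
        return href
    if page_url and urlsplit(page_url).scheme:
        # urljoin copies the scheme of the base URL onto a "//host/..." reference.
        return urljoin(page_url, href)
    # No usable page URL: fall back to the default scheme.
    return f"{default_scheme}:{href}"


print(resolve_canonical("//www.bostonherald.com/", "http://www.bostonherald.com"))
```

Taking the scheme from the page URL when it is available means a page fetched over https would keep https in its canonical URL, while the "http" default still covers the case described in this report.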