bramus / mixed-content-scan

Scan your HTTPS-enabled website for Mixed Content
MIT License
522 stars 51 forks

URLs with leading white space cause scanning problems #64

Open ghost opened 7 years ago

ghost commented 7 years ago

When a page contains a URL with leading whitespace, e.g. <a href=" http://abc.com">, the code treats it as a relative link and prepends the URL of the page being scanned, so the URL gets queued as https://mysite.com/ http://abc.com. I fixed this by adding:

    // trim white space
    $linkedUrl = trim($linkedUrl);

at the start of the private function absolutizeUrl($linkedUrl, $currentPageUrl).
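
To illustrate the failure mode, here is a minimal, self-contained sketch. The function absolutizeUrlSketch and its $trimFirst flag are hypothetical and only mimic the behaviour described above. They are not the project's actual absolutizeUrl() implementation, and the naive URL join is an assumption made for the example.

```php
<?php
// Hypothetical, simplified stand-in for absolutizeUrl(); NOT the project's real code.
// $trimFirst toggles the proposed fix so both behaviours can be compared.
function absolutizeUrlSketch(string $linkedUrl, string $currentPageUrl, bool $trimFirst): string
{
    if ($trimFirst) {
        // Proposed fix: strip leading/trailing white space before inspecting the URL.
        $linkedUrl = trim($linkedUrl);
    }

    // A URL that starts with a scheme (e.g. "http://") is already absolute.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $linkedUrl) === 1) {
        return $linkedUrl;
    }

    // Anything else is treated as relative to the current page (naive join, for illustration only).
    return rtrim($currentPageUrl, '/') . '/' . ltrim($linkedUrl, '/');
}

$page = 'https://mysite.com/';
$href = ' http://abc.com'; // note the leading space

echo absolutizeUrlSketch($href, $page, false), PHP_EOL; // https://mysite.com/ http://abc.com
echo absolutizeUrlSketch($href, $page, true), PHP_EOL;  // http://abc.com
```

Without the trim, the leading space prevents the scheme check from matching, so the absolute URL is mistaken for a relative one. With the trim in place, it is recognised as absolute and returned unchanged.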

This seems to fix the problem, but I can't claim to be really familiar with the code, so I'd welcome a review before it is incorporated.