- use the page URL or (if specified) the base URL from the HTML to resolve relative links
- avoid unnecessary calls to urllib.parse.urljoin (they can become expensive)
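
The two points above could be sketched roughly as follows. This is only an illustration, not the code in this change: the helper names `effective_base` and `resolve_link` and the example URLs are hypothetical, and the prefix check is one assumed way of skipping `urljoin` for links that are already absolute.

```python
from urllib.parse import urljoin


def effective_base(page_url, html_base=None):
    """Return the URL to resolve relative links against (hypothetical helper).

    Computed once per page: use the base URL from the HTML if one is
    specified (it may itself be relative to the page URL), otherwise
    fall back to the page URL.
    """
    if html_base:
        return urljoin(page_url, html_base)
    return page_url


def resolve_link(link, base):
    """Resolve a single link against the precomputed base (hypothetical helper)."""
    # Assumption: links that already start with an http(s) scheme are
    # absolute, so the comparatively expensive urljoin call is skipped.
    if link.startswith(('http://', 'https://')):
        return link
    return urljoin(base, link)
```

Usage: call `effective_base` once per page, then `resolve_link(src, base)` for every extracted link, so that `urljoin` runs only for links that actually need resolving.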
For one tested WAT file, the number of image links increased by 75% while 15% more CPU time was spent. I haven't yet checked the set of extracted URLs for duplicates, etc.