We have too many URL formats in the crawler. Sometimes the port is defined, sometimes not; sometimes we receive an object, sometimes a string. This patch tries to clean this up. It also fixes an issue where the crawler assumes 80 is always the default port, even though the default for HTTPS is 443.
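For reviewers, here is a rough sketch of the kind of normalization this is aiming for. It is written in Python for illustration only and is not the patch's actual code. The helper name `normalize_url`, the `DEFAULT_PORTS` table, and the object keys (`protocol`, `host`, `port`, `path`) are all assumptions made up for this example. The idea is that every caller gets the same shape back, with the port always filled in from the scheme when it is missing.

```python
from urllib.parse import urlsplit

# Default port per scheme (hypothetical table; the crawler's own may differ).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Return (scheme, host, port, path) for a URL given as a string or a dict.

    Hypothetical helper: the dict keys used below are assumptions, not the
    crawler's real object layout.
    """
    if isinstance(url, str):
        parts = urlsplit(url)
        scheme = parts.scheme.lower()
        host = parts.hostname or ""   # urlsplit already lowercases the hostname
        port = parts.port             # None when the URL carries no explicit port
        path = parts.path or "/"
    else:
        # Accept "https" as well as "https:" for the protocol field.
        scheme = url.get("protocol", "http").rstrip(":").lower()
        host = url.get("host", "").lower()
        port = url.get("port")
        path = url.get("path") or "/"

    # Fall back to the scheme's default port instead of always assuming 80.
    if port is None:
        port = DEFAULT_PORTS.get(scheme)

    return scheme, host, int(port) if port is not None else None, path
```

With this shape, `normalize_url("https://example.com/a")` and `normalize_url({"protocol": "https:", "host": "example.com", "path": "/a"})` both resolve to port 443, so downstream code no longer has to guess which format it received.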