Closed malloxpb closed 6 years ago
The solution to this problem is to make sure that all of the calls to the parsing functions in GURL (https://github.com/scrapy/scurl/blob/master/scurl/cgurl.pyx#L82) succeed (based on how it is done in gurl.cc, which can be seen here).
The GURL container marks URLs such as
http:///
as invalid. However, since we only use the parsing functions from the Chromium source (https://github.com/scrapy/scurl/blob/master/scurl/cgurl.pyx#L82), we do not mark those URLs as invalid. This could cause issues if we don't fix it :)
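
To illustrate the idea, here is a minimal sketch in plain Python. It is not scurl's code and does not call the Chromium parsing functions; it uses the standard library's `urllib.parse` as a stand-in. The function name `is_valid_url` and the `HOST_REQUIRED_SCHEMES` set are hypothetical, chosen only for this example. The point is that a URL is treated as valid only if every parsing step succeeds.

```python
from urllib.parse import urlsplit

# Hypothetical list of schemes that need a non-empty host.
# This is an assumption for the sketch, not Chromium's actual list.
HOST_REQUIRED_SCHEMES = {"http", "https", "ftp", "ws", "wss"}


def is_valid_url(url):
    """Return True only if every parsing step succeeds (hypothetical helper)."""
    try:
        parts = urlsplit(url)
        # Accessing .port raises ValueError for a malformed port.
        parts.port
    except ValueError:
        return False
    # A URL without a scheme is rejected in this sketch.
    if not parts.scheme:
        return False
    # Schemes such as http need a host, so "http:///" is rejected here.
    if parts.scheme in HOST_REQUIRED_SCHEMES and not parts.hostname:
        return False
    return True
```

With this check, `is_valid_url("http:///")` returns `False` while `is_valid_url("http://example.com/")` returns `True`. In scurl the same pattern would presumably mean collecting the return values of the parsing calls and storing the combined result as a validity flag, but that part is an assumption about how the fix could look.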