scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
https://scrapy.org
BSD 3-Clause "New" or "Revised" License

LxmlLinkExtractor unique_list missing key #3273

Closed (nikan1996 closed this issue 4 weeks ago)

nikan1996 commented 5 years ago

The unique_list call at https://github.com/scrapy/scrapy/blob/b364d27247b2d9b86c164569c7e0459fa3f8391b/scrapy/linkextractors/lxmlhtml.py#L130 is missing a key. I think it should behave like https://github.com/scrapy/scrapy/blob/b364d27247b2d9b86c164569c7e0459fa3f8391b/scrapy/linkextractors/lxmlhtml.py#L91 and deduplicate links by their URL.
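
To illustrate the difference, here is a minimal sketch that does not import Scrapy. The `Link` namedtuple and the `unique_list` function below are simplified, hypothetical stand-ins written for this example, not Scrapy's actual classes or helpers. The assumption is that two link objects pointing at the same URL can still compare as different (for example, because their anchor text differs), so deduplicating without a key keeps both.

```python
from collections import namedtuple

# Hypothetical stand-in for a link object; only two fields, for illustration.
Link = namedtuple("Link", ["url", "text"])


def unique_list(items, key=lambda x: x):
    """Order-preserving de-duplication (illustrative stand-in helper)."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k in seen:
            continue  # an item with this key was already kept
        seen.add(k)
        result.append(item)
    return result


links = [
    Link("https://example.com/a", "first anchor"),
    Link("https://example.com/a", "second anchor"),  # same URL, different text
    Link("https://example.com/b", "other page"),
]

# No key: whole objects are compared, so both links to /a survive.
without_key = unique_list(links)

# Key on the URL: only the first link to each URL is kept.
with_key = unique_list(links, key=lambda link: link.url)

print(len(without_key), len(with_key))
```

If the real extractor behaves like this sketch, passing a URL-based key at the second call site would make both code paths deduplicate consistently.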