marchtea / scrapy_doc_chs

Chinese translation of the Scrapy documentation

A small error in the "Following links" section #76

Open pchjia opened 8 years ago

pchjia commented 8 years ago

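The documentation calls response.urljoin with two arguments, but the method's first parameter is self, the reference to the Response instance, which is bound automatically and cannot be passed explicitly from outside the class. After checking the documentation, the correct form here is response.urljoin(href.extract()).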

The following is quoted from the article:

def parse(self, response):
    for href in response.css("ul.directory.dir-col > li > a::attr('href')"):
        url = response.urljoin(response.url, href.extract())
        yield scrapy.Request(url, callback=self.parse_dir_contents)

class Response(object_ref):
    def urljoin(self, url):
        """Join this Response's url with a possible relative url to form an
        absolute interpretation of the latter."""
        return urljoin(self.url, url)
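
To make the difference concrete, here is a minimal sketch that reproduces the problem without Scrapy installed. The Response class below is a hypothetical stand-in that only mirrors the urljoin method quoted above, and the example.com URL and page.html href are placeholders, not values from the tutorial.

```python
from urllib.parse import urljoin


class Response:
    """Hypothetical stand-in for scrapy's Response; only urljoin is mirrored."""

    def __init__(self, url):
        self.url = url

    def urljoin(self, url):
        # Same logic as the quoted source: join against this response's own URL.
        return urljoin(self.url, url)


# Placeholder URL, assumed for illustration only.
response = Response("http://example.com/dir/")

# Correct form: pass only the (possibly relative) href.
absolute = response.urljoin("page.html")

# Form used in the translated tutorial: response.url is passed as well,
# which supplies one positional argument too many.
try:
    response.urljoin(response.url, "page.html")
    raised = False
except TypeError:
    raised = True

print(absolute)
print(raised)
```

With this stand-in, the one-argument call returns an absolute URL, while the two-argument call fails with a TypeError, which matches the behavior described in this report.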