Closed by ghost 11 years ago
You can use the train_from_htmlpage() and scrape_page() methods, which receive an HtmlPage object. Then it's just a matter of converting a Scrapy Response object to an HtmlPage, which is pretty easy. Look at how slybot does it, for example: https://github.com/scrapy/slybot/blob/master/slybot/spider.py#L230
Btw, the scrapely mailing list would be more appropriate for these support questions.
Instead of passing a URL, is it possible to do something like
    def parse(self, response):
        s.train(response.body, encoding='iso-8859-1')
rather than making scrapely fetch the page itself from a URL or local file?