shaneaevans closed 12 years ago
Have you checked that the scrapely command-line tool (python -m scrapely.tool) still works after this change?
The change is API compatible, so unless the tool relies on private functions it should be fine. I checked some basic usage and it was OK (although, really, this should be automated).
I guess we should also 'fix' the tool. It requires users to specify the encoding, otherwise it assumes utf8; in the default case it should work out the encoding itself. I'll work up a patch.
I also note that the example in the README is broken: a 0 Scrapy project -n 1 -f author
doesn't work for me.
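To make the encoding point above concrete, here is a minimal sketch of how a tool could work out a page's encoding instead of assuming utf8. It is not scrapely's code and not the patch mentioned above; the function name guess_encoding, its parameters, and the lookup order (HTTP Content-Type header first, then a meta tag in the body, then a default) are assumptions made for illustration, using only the Python standard library.

```python
import codecs
import re
from email.message import Message

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
META_CHARSET_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def _known(name):
    """Return Python's normalized codec name, or None if the codec is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def guess_encoding(body, content_type=None, default="utf-8"):
    """Guess the encoding of raw HTML bytes (hypothetical helper).

    body: the undecoded response body, as bytes.
    content_type: the value of the HTTP Content-Type header, if available.
    """
    # 1. Prefer the charset parameter of the Content-Type header.
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset and _known(charset):
            return _known(charset)

    # 2. Otherwise look for a charset declaration near the top of the body.
    match = META_CHARSET_RE.search(body[:4096])
    if match:
        name = _known(match.group(1).decode("ascii"))
        if name:
            return name

    # 3. Fall back to the default only when nothing was declared.
    return default


raw = b'<html><head><meta charset="windows-1252"></head><body>caf\xe9</body></html>'
text = raw.decode(guess_encoding(raw))
```

With something like this, an explicit encoding argument would only be needed to override the guess, and utf8 would remain the last resort rather than the first assumption.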
The Scraper class can now be trained with an HtmlPage instead of requiring a URL. Creating the HtmlPage for training is also more correct now (encoding, headers, etc. are handled).
The InstanceBasedLearningExtractor is no longer re-initialized on each request, improving performance.
A failing test has been fixed and no longer needs to make an HTTP request.
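The performance point about not re-initializing the extractor on each request can be illustrated with a small sketch. This is not the code from this pull request: CachingScraper, build_extractor, and add_template are hypothetical names, and the extractor is modelled as a plain callable. The idea shown is simply to build the expensive object lazily, reuse it across requests, and rebuild it only after the training data changes.

```python
class CachingScraper:
    """Hypothetical wrapper that builds its extractor once and reuses it."""

    def __init__(self, build_extractor):
        # build_extractor: callable taking a list of templates and returning
        # a callable extractor (stands in for an expensive constructor).
        self._build_extractor = build_extractor
        self._templates = []
        self._extractor = None

    def add_template(self, template):
        self._templates.append(template)
        # New training data invalidates the cached extractor.
        self._extractor = None

    def scrape(self, page):
        # Build lazily on first use, then reuse for every later request.
        if self._extractor is None:
            self._extractor = self._build_extractor(list(self._templates))
        return self._extractor(page)


builds = []  # records each time the (stand-in) extractor is constructed


def build_extractor(templates):
    builds.append(len(templates))
    return lambda page: {"page": page, "templates": len(templates)}


scraper = CachingScraper(build_extractor)
scraper.add_template("template-1")
scraper.scrape("page-a")
scraper.scrape("page-b")  # reuses the extractor built for page-a
```

Here the extractor is constructed once for the two scrape calls; adding another template would trigger exactly one rebuild on the next request.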