scrapy / scrapely

A pure-python HTML screen-scraping library

What do you mean by "The training implementation is currently very simple and is only provided for references purposes, to make it easier to test Scrapely and play with it."? #54

Open bitliner opened 10 years ago

bitliner commented 10 years ago

Could you explain in more detail what the following sentence in README.md means?

The training implementation is currently very simple and is only provided for references purposes, to make it easier to test Scrapely and play with it. ...you should use train() with caution and make sure it annotates the area of the page you intended

What problems can arise from using the training implementation?

For example, a mismatch between the encoding of the data passed as input and the encoding of the HTML pages? Are there others?
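To illustrate the kind of encoding mismatch I have in mind, here is a minimal standard-library sketch. It does not use Scrapely and is not a claim about how train() works internally; the sample strings and variable names are made up for illustration. It assumes the page bytes are UTF-8 but get decoded as Latin-1, so an exact-match search for the training value fails:

```python
# Hypothetical illustration only: this is not Scrapely code.
training_value = "Café Münster"  # value supplied as training data (made up)

# Assume the page is served as UTF-8 bytes.
page_bytes = "<h1>Café Münster</h1>".encode("utf-8")

# Decoding with the wrong encoding garbles the non-ASCII characters,
# while decoding with the right one preserves them.
wrong_html = page_bytes.decode("latin-1")
right_html = page_bytes.decode("utf-8")

# An exact substring search only succeeds when the encodings agree.
found_with_wrong_encoding = training_value in wrong_html
found_with_right_encoding = training_value in right_html

print(found_with_wrong_encoding, found_with_right_encoding)
```

If the training step looks for the supplied value in the page text, a mismatch like this would presumably mean nothing (or the wrong region) gets annotated.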

If you can list the known problems, I may be able to help fix one of them.