scrapy / scrapely

A pure-python HTML screen-scraping library

Improve similarity algorithm to make use of the extracted data per region #27

Closed kalessin closed 11 years ago

kalessin commented 11 years ago

Improve the similarity algorithm so that it makes use of the data extracted from each region. When the similar-region algorithm finds multiple candidates for a given region, the extracted data should help it decide between them by choosing the shortest option. The main need for this feature is to improve extraction from tables with a variable number of fields, each one in a different row.
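
To make the tie-breaking rule concrete, here is a minimal sketch of "choose the shortest option". It is not scrapely's implementation: `pick_shortest_candidate`, the `(start, end)` region tuples, and the `extract` callback are hypothetical names used only for illustration, and skipping candidates that extract nothing is an added assumption.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical representation: a region is a (start, end) pair of token
# indexes into the page, with `end` exclusive.
Region = Tuple[int, int]


def pick_shortest_candidate(
    candidates: Sequence[Region],
    extract: Callable[[Region], str],
) -> Optional[Region]:
    """Return the candidate whose extracted data is shortest.

    Assumption (not from the issue): candidates that extract nothing are
    skipped, otherwise an empty match would always win. On a tie, the
    first candidate seen is kept.
    """
    best: Optional[Region] = None
    best_len: Optional[int] = None
    for region in candidates:
        data = extract(region)
        if not data:
            continue  # nothing extracted, so not a useful candidate
        if best_len is None or len(data) < best_len:
            best, best_len = region, len(data)
    return best


# Toy page: two table cells in one row, followed by the start of the next row.
tokens = ["<td>", "Color", "</td>", "<td>", "Red", "</td>",
          "</tr>", "<tr>", "<td>", "Size", "</td>"]


def extract(region: Region) -> str:
    start, end = region
    return "".join(tokens[start:end])


# Both candidates start at the same token, but the first one runs on into
# the neighbouring cell; the shorter extraction is preferred.
chosen = pick_shortest_candidate([(1, 5), (1, 2)], extract)
```

In the table case described above, a candidate that spills into a neighbouring cell or row yields longer extracted data, so preferring the shortest extraction tends to keep each field within its own row.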