scrapinghub / portia

Visual scraping for Scrapy
BSD 3-Clause "New" or "Revised" License

Handle required fields from schemas in item processor #633

Closed ruairif closed 8 years ago

ruairif commented 8 years ago

Extract CSS and XPath in the SlybotIBLExtractor
Run item validation in the SlybotIBLExtractor

Because validation and the selector (CSS and XPath) extractors now run inside
the IBLE, another sample can be tried when the current sample yields no valid
items. This approach also allows Container extractors to have nested
Containers, where not all required fields need to be present in the primary
Container.
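
A minimal sketch of the fallback behaviour described above, not the actual slybot code. The names `has_required_fields` and `extract_with_fallback`, and the representation of a sample as a `(name, extractor)` pair, are hypothetical and used only for illustration:

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Item = Dict[str, object]
# Hypothetical stand-in for a sample: a name plus a callable that
# extracts candidate items from a page.
Sample = Tuple[str, Callable[[str], List[Item]]]


def has_required_fields(item: Item, required: Iterable[str]) -> bool:
    """Return True when every required field is present and non-empty."""
    return all(item.get(name) not in (None, "", []) for name in required)


def extract_with_fallback(
    page: str, samples: List[Sample], required: List[str]
) -> Tuple[Optional[str], List[Item]]:
    """Try each sample in order; return the first that yields valid items."""
    for name, extractor in samples:
        # Validation runs right after extraction, so a failing sample
        # is discarded before the next one is tried.
        valid = [i for i in extractor(page) if has_required_fields(i, required)]
        if valid:
            return name, valid
    return None, []


samples = [
    ("sample_a", lambda page: [{"name": "Widget"}]),  # no "price" field
    ("sample_b", lambda page: [{"name": "Widget", "price": "9.99"}]),
]
used, items = extract_with_fallback("<html>...</html>", samples, ["name", "price"])
print(used, items)
```

Here `sample_a` is rejected because its item lacks the required `price` field, so extraction falls through to `sample_b`.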