USStateDept / search


Refactor for custom scraper settings #559

Closed. captjt closed this 7 years ago

captjt commented 7 years ago

Premise of Refactor

The prototype crawler (ETL) is working, but we want an easy way to reproduce these custom crawlers and scrapers, so we need to modularize the code into reusable packages that we can customize as needed.

captjt commented 7 years ago

Customizable Settings

Scraping custom elements

Creating a modular way to do custom scraping through settings is a bigger lift: we will need specific ways to drill into HTML nodes.
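A rough sketch of what settings-driven scraping could look like, for discussion only. This is not the prototype's code: the `SETTINGS` layout, the `SettingsScraper` class, and the `scrape` helper are all hypothetical names, and the only matching rules assumed here are a tag name plus an optional CSS class. It uses only Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Hypothetical settings format: output field -> rule describing which node to grab.
SETTINGS = {
    "title": {"tag": "h1"},
    "summary": {"tag": "p", "class": "summary"},
}


class SettingsScraper(HTMLParser):
    """Collects the text of every element that matches a rule in the settings."""

    def __init__(self, settings):
        super().__init__()
        self.settings = settings
        self.results = {field: [] for field in settings}
        self._open = []  # matched elements that have not been closed yet

    def handle_starttag(self, tag, attrs):
        # Track nesting of same-named tags so an inner </div> does not
        # close an outer matched <div>.
        for entry in self._open:
            if entry["tag"] == tag:
                entry["depth"] += 1
        classes = (dict(attrs).get("class") or "").split()
        targets = []
        for field, rule in self.settings.items():
            # A rule without "class" matches on tag name alone.
            if rule["tag"] == tag and rule.get("class") in (None, *classes):
                self.results[field].append("")
                targets.append((field, len(self.results[field]) - 1))
        if targets:
            self._open.append({"tag": tag, "depth": 1, "targets": targets})

    def handle_endtag(self, tag):
        for entry in self._open:
            if entry["tag"] == tag:
                entry["depth"] -= 1
        self._open = [e for e in self._open if e["depth"] > 0]

    def handle_data(self, data):
        # Text inside a matched element (including nested children) is appended.
        for entry in self._open:
            for field, index in entry["targets"]:
                self.results[field][index] += data


def scrape(html, settings):
    """Return {field: [text, ...]} for every node matched by the settings."""
    parser = SettingsScraper(settings)
    parser.feed(html)
    parser.close()
    return {f: [t.strip() for t in texts] for f, texts in parser.results.items()}


if __name__ == "__main__":
    page = '<h1>Travel</h1><p class="summary lead">Short <b>text</b></p><p>other</p>'
    print(scrape(page, SETTINGS))
```

The point of the sketch is that a new custom scraper would only need a new settings mapping, not new parsing code. Anything beyond tag and class matching (attribute values, nested paths, and so on) would need a richer rule format than the one assumed here.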

captjt commented 7 years ago

Benefits

A by-product will be multiple package files and an API to use: super simple and super powerful.

captjt commented 7 years ago

From 01-09

captjt commented 7 years ago

From 02-09

captjt commented 7 years ago

From 02-15