Metro-Records / la-metro-dashboard

An Airflow-based dashboard for LA Metro

Double daily scrape timeout, remove duplicate bill scrape #65

Closed hancush closed 3 years ago

hancush commented 3 years ago

Description

This pull request doubles the timeout on the daily scrape and removes the duplicate bill scrape. It does so by explicitly scraping people and events, which are not windowed by default, and then scraping bills with a window of zero. (Running a bare pupa update also scrapes bills, with a window of 3.)
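As a rough illustration of the change described above, the sketch below builds one pupa invocation per scraper instead of relying on a bare `pupa update`. The module name `lametro`, the helper `build_scrape_commands`, and the one-hour starting timeout are assumptions made for the example; the diff itself is not reproduced here, so the actual DAG may pass the scrapers in a single invocation or use different values.

```python
from datetime import timedelta
from typing import List


def build_scrape_commands(module: str, bill_window: int = 0) -> List[str]:
    """Return one pupa invocation per scraper (hypothetical helper)."""
    return [
        # People and events are not windowed by default, so they take no argument.
        f"pupa update {module} people",
        f"pupa update {module} events",
        # Bills are scraped exactly once, with an explicit window.
        f"pupa update {module} bills window={bill_window}",
    ]


# Placeholder value: the previous timeout is not stated in this PR description.
previous_timeout = timedelta(hours=1)
new_timeout = previous_timeout * 2

# "lametro" is an assumed jurisdiction module name.
commands = build_scrape_commands("lametro")
daily_scrape = " && ".join(commands)
print(daily_scrape)
```

Naming each scraper keeps the bill scrape from running twice, since nothing falls back to the bare update and its default bill window.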

Assuming this scrape completes, we should consult the log timestamps to see how long it actually takes and revise the timeout down accordingly.
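One way to do that follow-up is to take the first and last timestamps from a completed run and derive a tighter limit with some headroom. This is only a sketch: the timestamp format, the sample timestamps, the helper names, and the 1.5x headroom factor are all assumptions for illustration, not values taken from the real logs.

```python
from datetime import datetime, timedelta

# Assumed log timestamp format; adjust to match the real log lines.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def scrape_duration(first_ts: str, last_ts: str) -> timedelta:
    """Elapsed time between the first and last log timestamps of a run."""
    start = datetime.strptime(first_ts, TIMESTAMP_FORMAT)
    end = datetime.strptime(last_ts, TIMESTAMP_FORMAT)
    return end - start


def suggested_timeout(duration: timedelta, headroom: float = 1.5) -> timedelta:
    """Scale the observed duration by a safety margin (hypothetical policy)."""
    return duration * headroom


# Made-up timestamps, used only to exercise the helpers.
observed = scrape_duration("2021-01-01 00:00:00", "2021-01-01 01:20:00")
print(observed, suggested_timeout(observed))
```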

Related to #64