Lever-age / data-pipeline

Data scraping and cleaning for the Leverage project
GNU Affero General Public License v3.0

Copy Data from City FTP Site #1

Open ghost opened 7 years ago

ghost commented 7 years ago

Files are uploaded to ftp://ftp.phila-records.com. The pipeline should start with a scraper that checks this site for new and changed files and incorporates them into a data store.
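One possible shape for the "new and changed files" check, as a rough sketch rather than a settled design: take a snapshot of the remote listing and diff it against the snapshot saved on the previous run. The function names `list_remote_files` and `diff_listings` are hypothetical, anonymous login is an assumption, and `mlsd` only works if the server supports the MLSD command.

```python
from ftplib import FTP


def list_remote_files(host, path=""):
    """Return {filename: (size, modify)} for regular files under `path`.

    Assumes anonymous login is allowed and the server supports MLSD.
    """
    listing = {}
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (assumption)
        for name, facts in ftp.mlsd(path, facts=["type", "size", "modify"]):
            if facts.get("type") == "file":
                size = int(facts.get("size", -1))
                listing[name] = (size, facts.get("modify", ""))
    return listing


def diff_listings(previous, current):
    """Compare two snapshots and return (new_files, changed_files).

    A file counts as changed when its (size, modify) pair differs
    from the one recorded in the previous snapshot.
    """
    new = sorted(name for name in current if name not in previous)
    changed = sorted(
        name
        for name in current
        if name in previous and current[name] != previous[name]
    )
    return new, changed
```

The previous snapshot could be kept in whatever data store the pipeline ends up using; `diff_listings` only needs two plain dictionaries, so it can be tested without touching the network.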

sergeantbacon commented 7 years ago

Updating the issue to reflect our tasks for the CELaunchpad.

The script should compare the YTD finance file on the FTP site with what was last loaded into the database (for example, by modification date or file size). If the file on the FTP site contains a newer dataset than the one in the database, download it.
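A minimal sketch of that comparison, under several assumptions: the server answers the `SIZE` and `MDTM` commands, anonymous login works, and the metadata of the last loaded file is stored somewhere the script can read it. `FileInfo`, `parse_mdtm`, `fetch_remote_info`, and `needs_download` are hypothetical names, and the YTD file's actual name is left as a parameter because it is not given here.

```python
from datetime import datetime
from ftplib import FTP
from typing import NamedTuple, Optional


class FileInfo(NamedTuple):
    modified: datetime
    size: int


def parse_mdtm(response: str) -> datetime:
    """Parse an MDTM reply such as '213 20170315120000' (fractions dropped)."""
    stamp = response.split()[1].split(".")[0]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def fetch_remote_info(host: str, filename: str) -> FileInfo:
    """Ask the FTP server for the file's size and modification time."""
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (assumption)
        ftp.voidcmd("TYPE I")  # many servers only answer SIZE in binary mode
        size = ftp.size(filename)
        modified = parse_mdtm(ftp.sendcmd("MDTM " + filename))
    return FileInfo(modified, size if size is not None else -1)


def needs_download(remote: FileInfo, last_loaded: Optional[FileInfo]) -> bool:
    """Decide whether the remote file should be downloaded.

    Download when nothing has been loaded yet, when the remote file is
    newer, or when the timestamp matches but the size differs.
    """
    if last_loaded is None:
        return True
    if remote.modified > last_loaded.modified:
        return True
    return (
        remote.modified == last_loaded.modified
        and remote.size != last_loaded.size
    )
```

Treating a same-timestamp, different-size file as changed is a judgment call; a checksum of the downloaded file would be a stricter test, at the cost of always downloading first.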