ncopenpass / CampaignFinanceDataPipeline

Data Pipeline for NC Campaign Finance Dashboard
Apache License 2.0

Download election dataset raw files #15

Closed by davidpeckham 3 years ago

davidpeckham commented 3 years ago

I added a new pipeline step to download the dataset and hard-coded the list of raw files. We could easily move the file list to a separate config file and roll this into the existing import step.
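A rough sketch of what the config-file variant might look like, not the code in this PR. The config file layout (`{"raw_files": [...]}`), the `base_url` parameter, and both function names are hypothetical and chosen only for illustration:

```python
import json
import urllib.request
from pathlib import Path


def load_file_list(config_path):
    """Read the raw file names from a JSON config file.

    Assumed (hypothetical) layout: {"raw_files": ["a.csv", "b.csv"]}
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return list(config["raw_files"])


def download_raw_files(base_url, file_names, dest_dir,
                       fetch=urllib.request.urlretrieve):
    """Download each named file into dest_dir, skipping files already present.

    `fetch` takes (url, local_path) and is injectable so the step can be
    exercised without network access. Returns the local paths in the same
    order as file_names.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in file_names:
        target = dest / name
        if not target.exists():
            # Join manually so base_url works with or without a trailing slash.
            url = base_url.rstrip("/") + "/" + name
            fetch(url, str(target))
        paths.append(target)
    return paths
```

Skipping files that already exist keeps reruns cheap, and keeping the list in a config file means adding a new raw file would not require a code change.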

davidpeckham commented 3 years ago

Any feedback on this?

ChrisTheDBA commented 3 years ago

This approach needs more research. A static list of files is not helpful. If a dynamic list cannot be generated without elevated privileges, then we should better document how to acquire the files.

davidpeckham commented 3 years ago

Chris, I think we should merge this PR now to make it easier for developers to get started with the project. We can always improve it later.

More broadly, we need to make the pipeline easy to set up and reliable to run. The Jupyter notebooks are a good way to prototype the pipeline, but they're long, complex, poorly documented, and error-prone. If people are going to trust the results, we have to make the pipeline more repeatable and reliable.