CivicActions / edscrapers

US Department of Education Data Scraping Kit; see https://us-ed-scraping.ckan.io/dataset
GNU Affero General Public License v3.0

Filter ZIP files by contents #111

Open nightsh opened 4 years ago

nightsh commented 4 years ago

[STUB]

While scraping, we can download, extract, and inspect the contents of the ZIP files to eliminate zipped resources that don't contain data files.

This would likely be part of the Airflow DAGs and would likely take a long time to complete (though still less time than doing it manually).
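A minimal sketch of what the content check could look like, using only Python's standard `zipfile` module. The function name `zip_contains_data` and the `DATA_EXTENSIONS` set are hypothetical, since this issue does not yet define what counts as a data file. The sketch also reads only the archive's file listing rather than extracting every member, which is an assumption about what level of inspection is good enough.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical list of extensions treated as "data files";
# the real list would need to be agreed on for the project.
DATA_EXTENSIONS = {".csv", ".tsv", ".xls", ".xlsx", ".json", ".xml"}


def zip_contains_data(zip_bytes, data_extensions=DATA_EXTENSIONS):
    """Return True if the archive lists at least one file whose
    extension is in data_extensions. Corrupt archives return False."""
    try:
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # skip directory entries
                suffix = PurePosixPath(info.filename).suffix.lower()
                if suffix in data_extensions:
                    return True
    except zipfile.BadZipFile:
        # Not a readable ZIP file, so it cannot be confirmed to hold data.
        return False
    return False
```

A scraper or DAG task could call this on the downloaded bytes of each ZIP resource and drop the resource when it returns `False`. Nested archives (a ZIP inside a ZIP) are not handled here.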

To Be Continued...