18F / privacy-tools

GSA PII Dashboard
https://cg-9341b8ea-025c-4fe2-aa6c-850edbebc499.app.cloud.gov/site/18f/privacy-dashboard/
MIT License

DevOps for Privacy Offices

We envision a future in which the public can easily understand how and why personally identifiable information gets collected by government agencies.

To get there, we're working with federal privacy offices to extract structured data from privacy-related compliance documents that are currently published as PDFs. Structured data lets privacy offices search these documents more quickly, reduces unnecessary manual work, and lays a foundation for easier collaboration with engineering teams.

This project is funded by 10x.

The Privacy Dashboard is developed in its own repository.

Our phase three work is happening in partnership with the GSA's Privacy Office.

Install

The scraping code is written in Python and runs locally. We recommend using virtualenv to create a virtual environment for installing and managing the required Python libraries. Run these commands from the repository directory on your machine to create a local virtual environment, activate it, and install all requirements.

virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt
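
If you are unsure whether the environment is active before running the scraper, a quick check like the one below can help. This is a minimal sketch and not part of the repository: the helper name in_virtualenv is hypothetical, and it relies only on the standard library's sys.prefix and sys.base_prefix attributes.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # venv and recent virtualenv releases point sys.prefix at the environment,
    # while sys.base_prefix keeps pointing at the base Python installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; run `source .venv/bin/activate` first.")
```

Saving this to a scratch file and running it with python after the source .venv/bin/activate step should report the path of the .venv directory.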

Scraping Data

Running python sorn_scraper.py does the following: