ihmeuw / pseudopeople

pseudopeople is a Python package that generates realistic simulated data about a fictional United States population, designed for use in testing entity resolution (record linkage) methods or other data science algorithms at scale.
https://pseudopeople.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[Data access request]: Krista Park - Data for Home Use #475

Closed by discourse2data 2 weeks ago

discourse2data commented 3 weeks ago

What is the name of your project?

User Experience and Open Source Experiments

What is the purpose of your project?

Gain an understanding of how users outside of a corporately-managed computing system experience using pseudopeople data with open source record linkage / entity resolution software.

Who is involved in the project? Which of these people will have direct access to the pseudopeople input data?

Krista Park. No other users will have access.

What funding is the project under? What expectations with respect to open access and access to data come with that funding?

Self-funded. No outside dollars. Results of the work may contribute to record linkage projects completed for Krista Park's employer, the U.S. Census Bureau.

We commit to:

What data would you like to request?

Other data - more explanation

Expect to request either an additional state or full US in the future.

Ironholds commented 3 weeks ago

Sounds super interesting! Can you say more about how the user testing is going to be undertaken, or what you're looking for with users?

discourse2data commented 3 weeks ago

Because the pseudopeople examples need to be set up to mirror internal Census data sets (within the limits of appropriate privacy protection and disclosure avoidance), the first round of user testing will most likely be done by people with approved access to Census data (federal employees, contractors, and other users with special sworn status). Those who are able will also be asked to try the instructions outside the Census environment using pseudopeople.

Then, after publication, we anticipate needing to accept input and potentially revise code and/or procedures based on comments received through multiple channels, including GitHub, to ensure clear communication about the methodologies involved in the numerous code and concept releases.

Ironholds commented 2 weeks ago

Thanks for the clarifications! LGTM, @aflaxman

aflaxman commented 2 weeks ago

Great, you are approved for access. Please email me at abie@uw.edu with the email you want to use to proceed. :)