CityOfLosAngeles / data-workflows-101

A workshop and training for data workflows that we use at the City.
Apache License 2.0

Create updated CAP master dataset using Eventbrite sign-ins #17

Open igotcharts opened 6 years ago

igotcharts commented 6 years ago

The goal is to connect the Eventbrite CAP attendance records with each candidate's progress in the hiring process and their current LAPD sworn status. This will require combining three different datasets that share no common key. Using a combination of first name, last name, and the last four digits of the Social Security number, plus fuzzy matching, we will join the sheets as completely as we can.
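A minimal sketch of one way the matching could work, assuming each record is a dict with hypothetical keys `first`, `last`, and `last4` (the real column names are not given here). It requires the last four digits to match exactly and then scores name similarity with the standard library's `difflib.SequenceMatcher`. The `0.8` threshold is an arbitrary starting point that would need tuning against real data.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Return a 0..1 similarity ratio between two names, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_records(attendees, candidates, threshold=0.8):
    """Pair each attendee with the most similar candidate that shares the same last4.

    Returns a list of (attendee, candidate_or_None, score) tuples.
    The field names 'first', 'last', and 'last4' are assumptions for illustration.
    """
    results = []
    for att in attendees:
        best, best_score = None, 0.0
        att_name = f"{att['first']} {att['last']}"
        for cand in candidates:
            # Exact match on last four digits narrows the search before fuzzy scoring.
            if att["last4"] != cand["last4"]:
                continue
            score = name_similarity(att_name, f"{cand['first']} {cand['last']}")
            if score > best_score:
                best, best_score = cand, score
        if best is not None and best_score >= threshold:
            results.append((att, best, best_score))
        else:
            # Leave unmatched rows visible so they can be reviewed by hand.
            results.append((att, None, best_score))
    return results
```

Running the same function once against the hiring-process sheet and once against the sworn-status sheet would give two sets of links keyed on the Eventbrite record. Rows that come back with `None` are candidates for manual review rather than automatic rejection.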

jannasmith commented 6 years ago

What kind of identification information is collected via the Eventbrite registration process for each event? Do all Eventbrite events collect the same information?

igotcharts commented 6 years ago

I know Eventbrite sheets contain first name, last name, and last four of social. I'm pretty sure they also contain e-mail and phone numbers.

There are currently three ongoing Eventbrite events, each representing a different training location. Each event collects the same information.

The main problem is that the hiring process datasets, although they have the same fields, often contain slightly different entries (e.g., in Eventbrite one person's last name may be Anderson, but in the hiring process dataset it may be Anderson III).
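One possible way to reduce this kind of mismatch before any fuzzy comparison is to normalize last names by stripping generational suffixes. The sketch below is an assumption about what cleaning might help, not a description of the actual data. The suffix list is illustrative and would need extending once the real entries are inspected.

```python
import re

# Illustrative suffix list; extend after looking at the real data.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalize_last_name(name):
    """Lowercase a last name, drop periods and commas, and strip trailing suffixes."""
    tokens = re.sub(r"[.,]", " ", name).lower().split()
    # Keep at least one token so a surname that looks like a suffix is not erased.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)
```

With this, `normalize_last_name("Anderson III")` and `normalize_last_name("Anderson")` both produce the same string, so the two records can be compared on equal footing.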