This repository contains code and data about people and organizations. Potential uses include building training and evaluation data sets.

| Entity | Source | Download |
|---|---|---|
| Physician | CMS | CSV |
| Author | Open Library | CSV |
| Academic author | Open Academic Graph | CSV |
| Person | Wikidata | CSV |
| Person: Nicknames | onyxrev | CSV |
| Voter | Florida Voter Registration | CSV |
| Voter | North Carolina Voter Registration | CSV |
| Church | Wikidata via SPARQL | CSV |
| Licensee | US States | CSV |
| Inmate | Florida | CSV |
| Deceased | Veterans Affairs | CSV |
| Public school | California Department of Education | CSV |
| College | US Department of Education | CSV |
| Radio and TV station | Wikidata | CSV |

PetScan is a simple way to get a list of the Wikipedia articles in a category. For more advanced queries, SPARQL may be a better fit.
PetScan can export the list of articles in a category as CSV.
The CSV includes the Wikidata IDs, which can be fed to the script wikidata_org.py to look up their metadata.
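As a rough illustration of that workflow, the sketch below reads Wikidata IDs from a CSV and builds a SPARQL query for their English labels. It is not the code in wikidata_org.py, whose interface is not described here. The column name `Wikidata`, the helper names `read_wikidata_ids` and `build_label_query`, and the sample rows are all assumptions made for the example; check the header of your own export before relying on them.

```python
import csv
import io
import re

# Wikidata item IDs look like "Q42": the letter Q followed by digits.
QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def read_wikidata_ids(csv_file, column="Wikidata"):
    """Return the unique, well-formed Wikidata IDs found in one CSV column.

    `column` is an assumed header name; adjust it to match your export.
    """
    ids = []
    seen = set()
    for row in csv.DictReader(csv_file):
        qid = (row.get(column) or "").strip()
        # Skip blank or malformed cells and keep first-seen order.
        if QID_PATTERN.match(qid) and qid not in seen:
            seen.add(qid)
            ids.append(qid)
    return ids


def build_label_query(qids):
    """Build a SPARQL query that returns the English label of each item."""
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Made-up rows standing in for a PetScan export (illustration only).
sample_csv = io.StringIO(
    "title,Wikidata\n"
    "Example A,Q42\n"
    "Example B,\n"
    "Example C,Q64\n"
    "Example A again,Q42\n"
)
qids = read_wikidata_ids(sample_csv)
query = build_label_query(qids)
print(qids)
print(query)
```

The resulting query string could then be submitted to the Wikidata Query Service. For large categories, splitting the IDs into smaller batches before building each query is likely to be more reliable than sending one very long `VALUES` list.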
The original and processed data sets can be very large, so most data sets are not committed to this repository. Please use either
Written by Andrew Ziem. Copyright (c) 2017-2020 Compassion International.
The code is licensed under the GNU General Public License version 3, and the data sets belong to the original data owners. Please consult the original sources for data licenses.
While this repository contains names of entities and some other metadata, it does not contain any contact information such as mailing addresses, email addresses, or telephone numbers.