stephbuon / hansard-speakers

A data processing pipeline to disambiguate speakers in the 19th-century British Parliamentary debates.
MIT License

Data expansion guidelines #174

Closed stephbuon closed 2 years ago

stephbuon commented 2 years ago

Download as CSV and then re-upload so that the fonts and other formatting are the same throughout.

Fix the columns the way you'd like.

stephbuon commented 2 years ago

Add data expansion guidelines

stephbuon commented 2 years ago

Guideline for matching with multiple speakers

stephbuon commented 2 years ago

Names with initials are not included in the fuzzy matching process because a single-letter change (usually the initial) is sometimes the only thing that distinguishes two speakers who have the same first and last name but are different people.

It would be great if we could still perform fuzzy matching with these names, but ignore the middle initial.
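
One way this could look, as a rough sketch rather than the pipeline's actual code: strip single-letter middle tokens before comparing, then apply a similarity threshold. The helper names `strip_middle_initials` and `fuzzy_match_ignoring_initials`, the use of `difflib.SequenceMatcher`, and the `0.9` threshold are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Matches a single letter with an optional trailing period, e.g. "A" or "A."
_INITIAL = re.compile(r"[A-Za-z]\.?")


def strip_middle_initials(name: str) -> str:
    """Drop single-letter tokens that sit between the first and last token.

    Hypothetical helper: the first and last tokens are always kept,
    even if they are themselves initials.
    """
    tokens = name.split()
    if len(tokens) <= 2:
        return " ".join(tokens)
    middle = [t for t in tokens[1:-1] if not _INITIAL.fullmatch(t)]
    return " ".join([tokens[0], *middle, tokens[-1]])


def fuzzy_match_ignoring_initials(a: str, b: str, threshold: float = 0.9) -> bool:
    """Compare two names after removing middle initials.

    The threshold is an assumed value, not one taken from the pipeline.
    """
    a_norm = strip_middle_initials(a).lower()
    b_norm = strip_middle_initials(b).lower()
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold


if __name__ == "__main__":
    print(strip_middle_initials("John A. Smith"))
    print(fuzzy_match_ignoring_initials("John A. Smith", "Jon B. Smith"))
```

Because the initial is sometimes what separates two different people, a match found this way is probably best treated as a candidate: the original initials can be compared afterwards, and the pair rejected or flagged for review when they differ.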