stephbuon / digital-history

Instructional repository for "Text Mining as Historical Method"
GNU General Public License v3.0

Create two new csv files for temporal events #36

Closed: stephbuon closed this issue 3 years ago

stephbuon commented 3 years ago

Using the sentence_entities and entity_labels columns in /scratch/group/pract-txt-mine/hansard_justnine_12192019.csv, create two csv files:

1) a csv file with just columns for sentence_id and DATE. Name this file: hansard_ner_time.csv.

2) a csv file with just columns for sentence_id and EVENT. Name this file: hansard_ner_event.csv.

Please format the csv files so that each row has one sentence_id and one named entity. If a single sentence has multiple named entities, repeat the sentence_id on a separate row for each entity.

Example:

ID_123, 1985
ID_123, 1876

Please give Steph the code on Slack when you are finished. The code can be a regular py script, or whatever you prefer.

Please save the exported files to /scratch/group/pract-txt-mine.
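
One possible shape for the script described above, as a minimal sketch only. It assumes that sentence_entities and entity_labels each hold a Python-style list literal per row (for example "['1985', '1876']" and "['DATE', 'DATE']") with the two lists aligned position by position; the issue does not say how these columns are actually serialized, so the parsing step may need to change. The function names and the sample rows are hypothetical and made up for illustration.

```python
import ast
import csv
import io


def explode_entities(rows, wanted_label):
    """Yield one (sentence_id, entity) pair per entity tagged wanted_label."""
    for row in rows:
        # ASSUMPTION: each column is a Python-style list literal, and the two
        # lists are aligned position by position. Adjust if the real file differs.
        entities = ast.literal_eval(row["sentence_entities"] or "[]")
        labels = ast.literal_eval(row["entity_labels"] or "[]")
        for entity, label in zip(entities, labels):
            if label == wanted_label:
                yield row["sentence_id"], entity


def write_pairs(pairs, out_file, header):
    """Write a header row, then one row per (sentence_id, entity) pair."""
    writer = csv.writer(out_file)
    writer.writerow(header)
    writer.writerows(pairs)


def export(in_path, out_path, label):
    """Stream the source csv and write the rows for a single entity label."""
    with open(in_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        pairs = explode_entities(csv.DictReader(src), label)
        write_pairs(pairs, dst, ("sentence_id", label))


# Made-up rows standing in for the real file, to show the row-per-entity output.
sample_rows = [
    {"sentence_id": "ID_123",
     "sentence_entities": "['1985', '1876', 'the Great Exhibition']",
     "entity_labels": "['DATE', 'DATE', 'EVENT']"},
    {"sentence_id": "ID_124",
     "sentence_entities": "[]",
     "entity_labels": "[]"},
]

date_pairs = list(explode_entities(sample_rows, "DATE"))
event_pairs = list(explode_entities(sample_rows, "EVENT"))

buffer = io.StringIO()
write_pairs(date_pairs, buffer, ("sentence_id", "DATE"))
print(buffer.getvalue())
```

Under the same assumptions, calling export(source_csv, output_csv, "DATE") and export(source_csv, output_csv, "EVENT") with the input path and the two file names given in this issue would produce hansard_ner_time.csv and hansard_ner_event.csv. Sentences with no matching entity simply contribute no rows.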