hasadna / knesset-data-pipelines

Main repository for Open Knesset project - contains the knesset data scrapers and processing pipelines
https://oknesset.org/
MIT License

process for manual classification of committee protocol parts using Catma #179

Closed by OriHoch 1 year ago

OriHoch commented 4 years ago

Catma allows manual classification of parts of committee protocols.

Bar Ilan University (BIU) uses it to classify parts of the protocol according to the following tags:

In Catma, documents are the original committee protocol files (uploaded directly by BIU staff), which Catma parses to plain text. Each document can contain multiple annotations by different users (currently, at BIU, a single user annotates each document).
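The document/annotation relationship described above could be modeled roughly as follows. This is only a sketch and not Catma's actual data model or export format: the class names, the field names (`annotator`, `tag`, `start`, `end`) and the use of character offsets into the plain text are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    # All fields are hypothetical; Catma's real export may differ.
    annotator: str  # user who created the annotation
    tag: str        # tag applied to the span
    start: int      # character offset into the plain text (inclusive)
    end: int        # character offset into the plain text (exclusive)


@dataclass
class Document:
    name: str
    text: str  # plain text produced from the original protocol file
    annotations: list = field(default_factory=list)

    def annotated_spans(self):
        """Return (annotator, tag, annotated text) tuples ordered by position."""
        ordered = sorted(self.annotations, key=lambda a: (a.start, a.end))
        return [(a.annotator, a.tag, self.text[a.start:a.end]) for a in ordered]

    def by_annotator(self):
        """Group annotations by the user who made them."""
        grouped = defaultdict(list)
        for a in self.annotations:
            grouped[a.annotator].append(a)
        return dict(grouped)
```

With a single annotator per document (the current BIU setup), `by_annotator()` would return a one-key dict; keeping the annotator on each annotation means documents with several annotators need no schema change.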

To parse the classification results and integrate them back into the Knesset data:

OriHoch commented 4 years ago

Code for parsing the results: show notebook
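For readers without access to the notebook, one possible way to attach annotations back to protocol parts is sketched here. It is not the notebook's code. It assumes annotations are available as `(tag, start, end)` character offsets into the plain text and that each protocol part's text appears verbatim, in order, in that plain text; `locate_parts` and `tags_for_parts` are hypothetical helper names.

```python
def locate_parts(text, parts):
    """Find (start, end) offsets of each part in text, scanning left to right.

    Returns None for a part that cannot be found after the previous match.
    """
    spans = []
    cursor = 0
    for part in parts:
        start = text.find(part, cursor)
        if start == -1:
            spans.append(None)
            continue
        end = start + len(part)
        spans.append((start, end))
        cursor = end
    return spans


def tags_for_parts(text, parts, annotations):
    """Return, for each part, the sorted list of tags whose span overlaps it.

    annotations is an iterable of (tag, start, end) with end exclusive.
    """
    annotations = list(annotations)
    result = []
    for span in locate_parts(text, parts):
        if span is None:
            result.append([])
            continue
        p_start, p_end = span
        # Two half-open ranges overlap when each starts before the other ends.
        tags = {
            tag
            for tag, a_start, a_end in annotations
            if a_start < p_end and p_start < a_end
        }
        result.append(sorted(tags))
    return result
```

Matching by verbatim text is fragile if the plain-text conversion changes whitespace or characters, so a real integration would likely need some normalization before the lookup.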