Closed twinkarma closed 2 years ago
Is there only ever one right answer?
@twinkarma, a few questions already:
- Add test_documents to Project model
- Add train_document to Project model
Hadn't we planned to have separate tables for training and test documents? We may have decided against this and I've forgotten.
So far, what I've done (in #169) is to create classes for training and test documents which inherit from a base document class. (Note that to do this, I've moved all the properties and methods from Document() to BaseDocument(), then recreated the former as Document(BaseDocument), since it turns out that you can't overwrite properties in child classes, e.g. the project field.)
I'm not sure what your intention is with the annotator_max_train_score and annotator_max_test_score fields?
For num_annotations, would it be better to compute this on the fly? Or did we decide that this would be too slow, and to update this property with each annotation instead?
return this.test_documents.all().count()
and same for training score.
For the annotator_max_train_score and annotator_max_test_score, I've just changed the field names to num_training_documents and num_test_documents, which probably makes more sense.
Are we good to close this issue?
Overview
Allows the uploading of additional pre-annotated datasets for training and testing annotators before they can annotate for real. This is a common process used when recruiting annotators, for example https://crowd.cochrane.org/.
Training mode
Testing mode
Labelling answers
Documents in testing and training mode will have
More information on annotator page of what stage they're on
Subclass Document class to separate training, testing and real annotation task set?
Record whether annotator has completed training, testing or main annotation task, better way of recording projects that annotator has worked on
UI
Tasks
- Add test_documents to Project model
- Add train_document to Project model
Labelling documents with answers (gold standard)
For test and training documents, answers and explanations should be included in the JSON/CSV file they upload, following the format below:
So in CSV the columns will be
id | text | gold.[labelName].value | gold.[labelName].explanation
* Replace [labelName] with the name of the actual label specified in project configuration
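As a rough illustration of the column convention above, here is a minimal sketch of how an uploaded CSV could be read into per-document dictionaries. It is not the project's actual import code: the function names, the nested gold layout, the comma delimiter and the sentiment label are all assumptions made up for this example.

```python
import csv
import io

GOLD_PREFIX = "gold."  # column prefix taken from the format described above


def row_to_document(row):
    """Convert one flat CSV row into a nested document dict (hypothetical layout)."""
    doc = {}
    for column, cell in row.items():
        if column.startswith(GOLD_PREFIX):
            # Split from the right so label names containing dots still work.
            label, _, field = column[len(GOLD_PREFIX):].rpartition(".")
            if not label or field not in ("value", "explanation"):
                raise ValueError(f"Unexpected gold column: {column!r}")
            doc.setdefault("gold", {}).setdefault(label, {})[field] = cell
        else:
            # Ordinary columns such as id and text are copied through unchanged.
            doc[column] = cell
    return doc


def load_gold_csv(text):
    """Parse comma-delimited CSV text into a list of document dicts."""
    return [row_to_document(row) for row in csv.DictReader(io.StringIO(text))]


# "sentiment" is a made-up label name standing in for [labelName].
sample = (
    "id,text,gold.sentiment.value,gold.sentiment.explanation\n"
    "1,Great film,positive,The reviewer praises the film\n"
)
docs = load_gold_csv(sample)
```

Grouping the value and explanation under each label name keeps a document's gold answers together, so that checking an annotator's answer for one label only needs a single lookup.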