Closed: lilehman closed this 3 years ago
[Discussion/Question] Since the records will be assigned to annotators in batches with a due date associated with each assignment, how do we keep track of the assignment date and due date: in the same annotator-record assignment mapping CSV file, or in a separate file or database?
[Organization of the annotator-record CSV mapping file] Since there will be multiple datasets/folders for each project (e.g. GE, demo-data, and later on DWC data), should we assume that each dataset folder will have one CSV file to define the annotator-record assignment for records in that particular folder?
[Discussion/Question] Since the records will be assigned to annotators in batches with a due date associated with each assignment, how do we keep track of the assignment date and due date: in the same annotator-record assignment mapping CSV file, or in a separate file or database?
Hmmmm, good question! It would be easy to keep it in the database so it can be easily manipulated and accessed, but I'm thinking about how to implement this. Maybe when the user selects "I want to start annotating" or "I want more annotations", a timestamp is created for when that request was made and another for when it is due (something like two weeks later). If it's in the file, we would have to keep it updated by writing to it each time this request happens, which might get confusing and tedious.
[Organization of the annotator-record CSV mapping file] Since there will be multiple datasets/folders for each project (e.g. GE, demo-data, and later on DWC data), should we assume that each dataset folder will have one CSV file to define the annotator-record assignment for records in that particular folder?
I think each dataset should have its own mapping/assignment CSV file. That is easier to reason about, and we don't have to add another column stating which project the records and events belong to.
[Discussion/Question] Since the records will be assigned to annotators in batches with a due date associated with each assignment, how do we keep track of the assignment date and due date: in the same annotator-record assignment mapping CSV file, or in a separate file or database?
Hmmmm, good question! It would be easy to keep it in the database so it can be easily manipulated and accessed, but I'm thinking about how to implement this. Maybe when the user selects "I want to start annotating" or "I want more annotations", a timestamp is created for when that request was made and another for when it is due (something like two weeks later). If it's in the file, we would have to keep it updated by writing to it each time this request happens, which might get confusing and tedious.
I agree with this. It wouldn't be much trouble to add due date info to the database. When admins accept record requests, maybe we could give them an option to specify the due date (with a default of "x weeks from now"). The way I see it, the difficult part is assigning due dates when the CSV file is updated externally, since users may not have made a request for more records. I also think assigning due dates only from a default value would be somewhat problematic, since we may be assigning more records than the annotators can do in that timespan (e.g. we could end up assigning a user 1,000 records and have that due in a week). Maybe we could use a row in the CSV file to include a rate of how many annotations we should expect to get done in a week, and base due dates off of that?
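A rough sketch of the rate-based due date idea, just to make the arithmetic concrete. Everything here is an assumption for illustration: the function name `compute_due_date`, the `records_per_week` parameter, and the two-week fallback are not part of the existing code.

```python
import math
from datetime import datetime, timedelta, timezone

# Assumed fallback when no rate is known ("something like two weeks later").
DEFAULT_WEEKS = 2


def compute_due_date(assigned_at, num_records, records_per_week=None,
                     default_weeks=DEFAULT_WEEKS):
    """Return a due date for a batch assigned at `assigned_at`.

    If an expected annotation rate is given, allow enough whole weeks to
    finish the batch at that rate (at least one week). Otherwise fall
    back to the default number of weeks.
    """
    if records_per_week is None or records_per_week <= 0:
        weeks = default_weeks
    else:
        weeks = max(1, math.ceil(num_records / records_per_week))
    return assigned_at + timedelta(weeks=weeks)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # 1,000 records at an expected 100 per week gives a 10-week window.
    print(compute_due_date(now, 1000, records_per_week=100))
    # No rate available: use the default window.
    print(compute_due_date(now, 50))
```

With this, the "1,000 records due in a week" case would not happen as long as a rate is supplied; the default only applies when no rate is known.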
[Organization of the annotator-record CSV mapping file] Since there will be multiple datasets/folders for each project (e.g. GE, demo-data, and later on DWC data), should we assume that each dataset folder will have one CSV file to define the annotator-record assignment for records in that particular folder?
I think each dataset should have its own mapping/assignment CSV file. That is easier to reason about, and we don't have to add another column stating which project the records and events belong to.
Either way could work, but I do think separating the data by providers is the way to go. It'll be easier to change how data from different vendors is treated, not to mention it'll be more organized.
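For the one-CSV-per-dataset-folder layout, discovery could look something like the sketch below. The file name `assignments.csv` and the function `find_assignment_files` are hypothetical; the thread does not fix a name or a directory layout beyond "one CSV per dataset folder".

```python
from pathlib import Path

# Hypothetical file name; the actual name has not been decided.
ASSIGNMENT_FILENAME = "assignments.csv"


def find_assignment_files(project_root):
    """Map each dataset folder name to its assignment CSV, if present.

    Only immediate subfolders of `project_root` are checked. Folders
    without an assignment file are skipped.
    """
    root = Path(project_root)
    found = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        candidate = folder / ASSIGNMENT_FILENAME
        if candidate.is_file():
            found[folder.name] = candidate
    return found
```

Since the folder name identifies the dataset, no extra project column is needed in the CSV itself.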
Closed by #25
Load the record assignment for each annotator from a CSV file.
(The CSV record assignment file will be generated by another script that we run periodically.)
Record assignment file format:
Option 1: one CSV assignment file with two columns: <annotator_user_name, record_name>
Option 2: one record assignment CSV file per annotator
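A minimal sketch of loading the Option 1 format, assuming the file has no header row and exactly the two columns <annotator_user_name, record_name>. The function name `load_assignments` is made up for illustration.

```python
import csv
from collections import defaultdict


def load_assignments(csv_file):
    """Read <annotator_user_name, record_name> rows from an open file.

    Returns a dict mapping each annotator to a list of record names in
    file order. Blank or incomplete rows are skipped, and a repeated
    (annotator, record) pair is kept only once.
    """
    assignments = defaultdict(list)
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue
        annotator, record = row[0].strip(), row[1].strip()
        if not annotator or not record:
            continue
        if record not in assignments[annotator]:
            assignments[annotator].append(record)
    return dict(assignments)


if __name__ == "__main__":
    import io

    sample = io.StringIO("alice,rec_001\nalice,rec_002\nbob,rec_001\n")
    print(load_assignments(sample))
```

Option 2 would reduce to calling the same reader once per annotator file, with the annotator name taken from the file name instead of the first column.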