Open imrigzz opened 1 year ago
Implementation is the most important part of the RFC, so you need to elaborate more on what you need or plan to do. Also, you don't need to write a JSONL parser; there is already one available at https://pypi.org/project/jsonlines/. You can start working with the JSONL file that Kaldan has uploaded in the RFW.
Looking at the flowchart, you need to update your implementation.
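For reference, the counting approach described in the RFW's Conceptual Design could be sketched roughly as below. This is only a sketch under assumptions: the field names `_annotator_id`, `answer`, and `spans` are assumed from typical Prodigy output and should be checked against the uploaded JSONL, and the function name `count_segmented_lines` is hypothetical. It uses the standard `json` module so it runs without extra dependencies; the reader from the `jsonlines` package could supply the entries instead.

```python
import json
from collections import defaultdict


def count_segmented_lines(jsonl_lines):
    """Tally accepted line spans and rejected/ignored entries per annotator.

    `jsonl_lines` is any iterable of JSONL strings, e.g. an open file.
    """
    report = defaultdict(
        lambda: {"lines_segmented": 0, "rejected": 0, "ignored": 0}
    )
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        entry = json.loads(raw)
        # Assumed field name; adjust to whatever the real export uses.
        annotator = entry.get("_annotator_id", "unknown")
        answer = entry.get("answer")
        if answer == "accept":
            # Each span in an accepted entry counts as one segmented line.
            report[annotator]["lines_segmented"] += len(entry.get("spans") or [])
        elif answer == "reject":
            report[annotator]["rejected"] += 1
        elif answer == "ignore":
            report[annotator]["ignored"] += 1
    return dict(report)


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    sample = [
        '{"_annotator_id": "ann-1", "answer": "accept", "spans": [{"start": 0, "end": 5}, {"start": 6, "end": 9}]}',
        '{"_annotator_id": "ann-1", "answer": "reject", "spans": []}',
        '{"_annotator_id": "ann-2", "answer": "ignore"}',
    ]
    for annotator, counts in count_segmented_lines(sample).items():
        print(annotator, counts)
```

Passing an open file handle (`with open(path, encoding="utf-8") as f: count_segmented_lines(f)`) works the same way, since the function only iterates over lines.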
Work Planning
Details
> ## Table of Contents
> * [Housekeeping](#housekeeping)
> * [Owner](#owner)
> * [Summary](#summary)
> * [Is This Really Necessary?](#is-this-really-necessary)
> * [Motivation](#motivation)
> * [Named concepts](#named-concepts)
> * [Examples](#examples)
> * [Conceptual Design](#conceptual-design)
> * [Drawbacks](#drawbacks)
> * [Alternatives](#alternatives)
> * [New Data](#new-data)
> * [Adoption Window](#adoption-window)
>
> ## Housekeeping
> Make sure to clearly understand Type-A and [Type-B](https://docs.google.com/document/d/17RHdAuJep5GsirwL7vEbnE1qX9zdmf67YioNQxA-c-k/edit#heading=h.yrnebqnrvkpj) requests, and the relevant limitations. Failing to follow the guidelines pertaining to the two acceptable types of RFWs will automatically lead to the disqualification of the RFW.
>
> Take time to complete each section below with as much detail as is required to establish a comprehensive understanding of the underlying product specification.
>
> **ALL BELOW FIELDS ARE REQUIRED**
>
> ## Summary
> We need a report on how many lines have been segmented by all the annotators in a group on our Prodigy platform. With this report, we would like to pay them according to the lines they have segmented.
>
> ## Is This Really Necessary?
> Yes, it is. Without such a report generator, it is impossible to pay them according to the lines they have segmented.
>
> ## Motivation
> We need this report generator to have clarity regarding how we are paying our annotators.
>
> ## Named Concepts
> **Prodigy**: [annotating platform.](https://prodi.gy/)
>
> ## Conceptual Design
> We have a JSONL file containing all the annotation entries. We need a parser to count how many lines have been segmented by each annotator.
>
> 1. Input: The parser will take a JSONL file containing annotation entries as input.
> 2. Output: The parser will provide a count of the lines segmented by each annotator.
>
> ### Approach
> 1. Open the JSONL file for reading.
> 2. Iterate over each line in the JSONL file:
>    a. Extract the annotator ID from the line.
>    b. Update the segmented-line count for the current annotator in the dictionary if the JSONL line contains "answer":"accept".
>    c. Count the number of line spans in the current line.
>    d. Also update the rejected and ignored counts for the current annotator in the dictionary if the line contains "answer":"reject" or "answer":"ignore".
> 3. Print/export the counts for each annotator.
>
> ### Data storage format
>
> ![annotators](https://github.com/OpenPecha/Requests/assets/63168284/8bf50945-be2f-4c2a-8e5a-f271eb80ddd4)
>
> ### Flow chart
>
> ![image](https://github.com/OpenPecha/Requests/assets/63168284/e212f793-bd94-413a-89b2-6e7a11209bbf)
>
> ## Drawbacks
> _What are the possible drawbacks of this? Think carefully about how the proposed work will affect what we already have, and the possible ways in which it might end up limiting us in the future, or take us in directions that become diversions from our mission._
>
> ## Alternatives
> _If applicable, explain what alternatives are there available and known to us that would allow achieving the same or similar business value. Particularly pay attention to the 80/20 rule here i.e. alternatives where we might get 80% of the business value with 20% of the work._
>
> ## New Data
> _If applicable, explain clearly the new data artifacts that will result from implementing this proposed work._
>
> ## Adoption Window
> A rough timing for the planned release of the specification possibly resulting from this request.

Work Phases
Planning
Keep original naming and structure, and keep it as the first section in the Work Phases section
Implementation
A list of checkboxes, one per PR. Each PR should have a descriptive name that clearly illustrates what the work phase is about.
Completion