[Feature] Export the evaluation summary table in csv format to the outputs folder

Udayraj123 commented 2 days ago

Is your feature request related to a problem? Please describe.

Currently, when we use an evaluation.json file with should_explain_scoring set to true, a table is printed in the console that explains how the scoring happened for a particular OMR sheet.

This is particularly useful for debugging complex answer keys with features such as section-wise marking, negative marking, bonus questions etc. (ref for all capabilities)

However, there is currently no way to output this table as a separate CSV file per OMR sheet.

Describe the solution you'd like

Create a new folder named evaluation/ inside the outputs directory for each run (using the outputs namespace).
The output filename can match the input filename, with the .jpg extension replaced by .csv.
Follow the code trail for explanation_table in the evaluation.py file and maintain a new object that holds the content for the CSV, preferably a pandas DataFrame.
We can use the to_csv util from pandas to export the CSV from the evaluation class itself.
Add a flag to control this in the evaluation_schema.py file.
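
A minimal sketch of the steps above, assuming the explanation rows are already available as plain lists. The function name `export_explanation_csv`, its parameters, the `should_export_csv` flag, and the column names in the usage note are hypothetical and not taken from the OMRChecker codebase; only `pathlib` and `pandas.DataFrame.to_csv` are real APIs here.

```python
from pathlib import Path

import pandas as pd


def export_explanation_csv(rows, columns, input_image_path, outputs_dir,
                           should_export_csv=True):
    """Write one explanation table to <outputs_dir>/evaluation/<image stem>.csv.

    All names here are illustrative assumptions, not the project's actual API.
    Returns the path of the written file, or None when the flag is off.
    """
    if not should_export_csv:
        return None

    # Create outputs/evaluation/ if it does not exist yet.
    evaluation_dir = Path(outputs_dir) / "evaluation"
    evaluation_dir.mkdir(parents=True, exist_ok=True)

    # Same name as the input image, with the extension swapped for .csv.
    csv_path = evaluation_dir / Path(input_image_path).with_suffix(".csv").name

    # Hold the table content in a DataFrame and export it without the index.
    explanation_df = pd.DataFrame(rows, columns=columns)
    explanation_df.to_csv(csv_path, index=False)
    return csv_path
```

For example, calling `export_explanation_csv([["q1", "A", "A", 1.0]], ["Question", "Marked", "Answer", "Score"], "inputs/sheet1.jpg", "outputs")` would write `outputs/evaluation/sheet1.csv`. In the real code, the flag would presumably be read from evaluation.json rather than passed as an argument.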

Describe alternatives you've considered N/A

Additional context

Drita-ai commented 2 days ago

Hello @Udayraj123! I'd like to work on this issue. Please assign it to me.

Udayraj123 commented 2 days ago

Hey @Drita-ai, sure. Please share the approach you'd like to take after going through the resources mentioned. Once you're clear on what needs to be done, I can assign the issue to you.

Anshu370 commented 2 days ago

Hello @Udayraj123. I'd like to help with this. Please assign it to me. Thank you!

Udayraj123 commented 2 days ago

Hey @Anshu370, sure. Please share the approach you'd like to take after going through the resources mentioned. Once you're clear on what needs to be done, I can assign the issue to you.

offline-keshav commented 1 day ago

@Udayraj123 I would like to work on this issue. The basic idea is that we can use the pandas library to manage the CSV files and store them in a new folder as per the instructions above, but I am unable to understand the codebase and how it works. If there are any readme/instruction files that explain the codebase, they would be very helpful, since understanding code of this complexity by reading each program individually will be a tough task.

Drita-ai commented 1 day ago

> Hey @Drita-ai, sure. Please share the approach you'd like to take after going through the resources mentioned. Once you're clear on what needs to be done, I can assign the issue to you.

Hi @Udayraj123, after going through the resources that you mentioned, here's my proposed approach:

I'll go through evaluation.py to locate the section of code that generates the evaluation explanation table. I've already done that.
Then I'll create a new directory named evaluation inside outputs/. I have already tried implementing this with the help of the file.py file: on every run, a fresh evaluation directory is created with no leftover *.csv files.
Next, I'll export the evaluation explanation table to a CSV file with the help of pandas, keeping the CSV's filename the same as the .jpg file from which it was generated.
Finally, I'll add a flag to control this in the evaluation_schema.py file.
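
The directory step could look roughly like this. It is only a sketch using the standard library: the helper name `reset_evaluation_dir` is hypothetical, and the actual change would go through the existing helpers in file.py rather than a standalone function.

```python
from pathlib import Path


def reset_evaluation_dir(outputs_dir):
    """Ensure <outputs_dir>/evaluation exists and holds no stale *.csv files.

    Hypothetical helper for illustration; not part of the OMRChecker codebase.
    """
    evaluation_dir = Path(outputs_dir) / "evaluation"
    evaluation_dir.mkdir(parents=True, exist_ok=True)

    # Remove CSVs left over from a previous run; other files are untouched.
    for stale_csv in evaluation_dir.glob("*.csv"):
        stale_csv.unlink()

    return evaluation_dir
```

Calling it once at the start of a run leaves an empty (of CSVs) evaluation directory for the per-sheet exports to write into.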

Udayraj123 commented 1 day ago

Looks like there are multiple folks working on this. Since @Drita-ai shared an approach on this issue first, I will have to assign the issue to @Drita-ai.

@tushar-badlani your efforts are appreciated, but please make sure to get yourself assigned to the issue first before working on a PR directly, especially when others have already commented and shown interest in working on it. I'll have to keep the PR on hold for now.

@Drita-ai it's up to you now whether you wish to continue working on this issue or pick a new one. Either way, I'll make sure both of your efforts are given due credit.

Udayraj123 commented 1 day ago

@offline-keshav the code is better structured in the dev branch. Currently, the variable naming and code comments are the guide for understanding it. One good way to start is by going through the schema/ folder and understanding the basic keys in template.json first.

There are no separate docs yet (we have an issue created for this).

tushar-badlani commented 1 day ago

@Udayraj123 I appreciate the feedback and the guidance. I'll definitely make sure to get assigned to issues before working on them in the future.

Udayraj123 commented 1 day ago

If it helps to know, @tushar-badlani, your code may already get picked up by someone who faced this issue recently. I've asked them to try it out and share their feedback on the Discord channel.

Drita-ai commented 1 day ago

Hi @Udayraj123, it's pretty late to ask this, but I followed the Getting Started guidelines, which suggest forking OMRChecker. That created an origin/master branch (in my GitHub repo) with the code from the upstream/master branch. Do I have to merge origin/master with upstream/dev before working, or can I continue with the default settings?

Udayraj123 commented 18 hours ago

@Drita-ai when you created the fork, you got an origin/master branch that is already the same as upstream/master. For this task, it's okay to raise a PR directly against the master branch, so there's no need to merge with dev yet.

Udayraj123 / OMRChecker

[Feature] Export the evaluation summary table in csv format to the outputs folder #221