autograder-org / autoGrader-frontend

An automated assignment grading system that leverages large language models (LLMs) to improve grading efficiency and reliability. It includes modules for data input, criteria definition, AI integration, consistency checks, and comprehensive reporting, with the aim of improving educational outcomes.
https://autograder.dev

Evaluate Efficiency of LLMs to grade assignments using Prompting Techniques #11

Open parthasarathydNU opened 5 months ago

parthasarathydNU commented 5 months ago

Objective

To experimentally determine the effectiveness of Large Language Models (LLMs) in grading various types of assignment submissions and to assess their performance relative to human graders.

Background Resources

Experimental Design

  1. Hypothesis Formation: Develop clear hypotheses about the potential outcomes of LLM-based assignment grading. As a starting point, see the wiki document "Hypotheses for Testing Automated Assignment Grading Software".

  2. Data Collection:

     - Conduct tests using both whole assignments and segmented parts.
     - Collect grading outcomes from LLMs and compare them with benchmarks set by human graders.

  3. Analysis:

     - Use statistical methods to analyze the collected data and validate the hypotheses.
     - Evaluate how closely LLM grading aligns with human grading standards.
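The analysis step above could be sketched with a few simple agreement metrics (mean absolute error, Pearson correlation, and agreement within a tolerance band). The scores below are hypothetical placeholders, and the ±5-point tolerance is an assumed threshold, not one specified in this issue:

```python
from statistics import mean

# Hypothetical scores (0-100) for ten submissions; in a real run these would
# come from the LLM grading experiments and the human-grader benchmarks.
human_scores = [88, 72, 95, 60, 81, 77, 90, 65, 84, 70]
llm_scores = [85, 75, 93, 58, 84, 74, 92, 68, 80, 73]

def mean_absolute_error(a, b):
    """Average absolute difference between paired scores."""
    return mean(abs(x - y) for x, y in zip(a, b))

def pearson_r(a, b):
    """Pearson correlation coefficient between paired scores."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def within_tolerance(a, b, tol=5):
    """Fraction of submissions where LLM and human grades differ by at most `tol` points."""
    return sum(abs(x - y) <= tol for x, y in zip(a, b)) / len(a)

print(f"MAE:       {mean_absolute_error(human_scores, llm_scores):.2f}")
print(f"Pearson r: {pearson_r(human_scores, llm_scores):.3f}")
print(f"Within ±5: {within_tolerance(human_scores, llm_scores):.0%}")
```

The same metrics could be computed separately for whole-assignment and segmented-grading runs to compare the two data-collection conditions.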

Goals

Future Work

Expected Outcomes

This issue aims to methodically assess the capabilities of LLMs in an educational setting, focusing on their potential to enhance or replace traditional grading methods while maintaining or improving grading accuracy and personalization.