bmesuere opened 9 months ago
Some old code I wrote to generate answers based on questions as a stand-alone script:
```js
// Requires Node 18+ (built-in fetch/Headers) and must run as an ES module
// (the script uses top-level await).
import OpenAI from "openai";
import { JSDOM } from "jsdom";

// Dodona API token and OpenAI API key are filled in locally.
const dodonaHeaders = new Headers({
  "Authorization": ""
});
const openai = new OpenAI({
  apiKey: ""
});

const systemPrompt = "Your goal is to help a teaching assistant answer student questions for a university-level programming course. You will be provided with the problem description, the code of the student, and the question of the student. Your answer should consist of 2 parts. First, very briefly summarize what the student did wrong to the teaching assistant. Second, provide a short response to the question aimed at the student in the same language as the student's question.";

const questionId = 148513;

async function fetchData(questionId) {
  // fetch question data from https://dodona.be/nl/annotations/<ID>.json
  let r = await fetch(`https://dodona.be/nl/annotations/${questionId}.json`, { headers: dodonaHeaders });
  const questionData = await r.json();
  const lineNr = questionData.line_nr;
  const question = questionData.annotation_text;
  const submissionUrl = questionData.submission_url;

  // fetch submission data
  r = await fetch(submissionUrl, { headers: dodonaHeaders });
  const submissionData = await r.json();
  const code = submissionData.code;
  const exerciseUrl = submissionData.exercise;

  // fetch exercise data
  r = await fetch(exerciseUrl, { headers: dodonaHeaders });
  const exerciseData = await r.json();
  const descriptionUrl = exerciseData.description_url;

  // fetch description and convert the HTML to plain text
  r = await fetch(descriptionUrl, { headers: dodonaHeaders });
  const descriptionHtml = await r.text();
  const description = htmlToText(descriptionHtml);

  return { description, code, question, lineNr };
}

async function generateAnswer({ description, code, question, lineNr }) {
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { "role": "system", "content": systemPrompt },
      { "role": "user", "content": `Description: ${description}\nCode: ${code}\nQuestion on line ${lineNr}: ${question}` }
    ]
  });
  console.log(response);
  console.log(response.choices[0].message);
  return response.choices[0].message.content;
}

// Strip scripts and page boilerplate from the exercise description,
// keeping only the plain text up to the "Links" section.
function htmlToText(html) {
  const dom = new JSDOM(html);
  const text = dom.window.document.body.textContent
    .split("\n")
    .map(l => l.trim())
    .filter(line => !line.includes("I18n"))
    .filter(line => !line.includes("dodona.ready"))
    .join("\n");
  return removeTextAfterSubstring(text, "Links").trim();
}

function removeTextAfterSubstring(str, substring) {
  const index = str.indexOf(substring);
  if (index === -1) {
    return str; // substring not found
  }
  return str.substring(0, index);
}

const data = await fetchData(questionId);
console.log(data);
await generateAnswer(data);
```
I tested the runtime performance of a few models on my Mac Studio (64 GB memory):
| Model | Quantization | Memory usage | Inference speed |
|---|---|---|---|
| codellama-34b-instruct | Q5_K_M | 22.13 GB | 9.87 tok/s |
| codellama-34b-instruct | Q6_K | 25.63 GB | 9.58 tok/s |
| codellama-34b-instruct | Q8_0 | 33.06 GB | 9.32 tok/s |
| codellama-70b-instruct | Q4_K_M | 38.37 GB | 7.00 tok/s |
| codellama-70b-instruct | Q6_0 | 49.39 GB | crashed |
| mixtral-8x7b-instruct | Q5_K_M | 29.64 GB | 21.5 tok/s |
I could not validate the output of codellama-70b since it seems to use a different prompt format.
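Local models like these can be driven with the same OpenAI SDK as the script above: tools such as llama.cpp's server and LM Studio expose an OpenAI-compatible API, and the client accepts a `baseURL` option. A minimal sketch; the port, model name, and messages are placeholders, not the setup used for the numbers above:

```js
import OpenAI from "openai";

// Point the SDK at a local OpenAI-compatible server instead of api.openai.com.
const local = new OpenAI({
  baseURL: "http://localhost:8080/v1", // placeholder port for the local server
  apiKey: "not-needed" // local servers typically ignore the key
});

const response = await local.chat.completions.create({
  model: "mixtral-8x7b-instruct", // whichever model the server has loaded
  messages: [
    { role: "system", content: "You assist a TA in answering student questions." },
    { role: "user", content: "Why does my loop never terminate?" }
  ]
});
console.log(response.choices[0].message.content);
```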
I played around with the various models this afternoon. Some early observations:
With the increasing capabilities of LLMs, it is only a matter of time before they become powerful and cheap enough to use inside Dodona. A first step might be to generate draft answers to questions from students. It might function as follows:

1. A student asks a question about their submission.
2. An LLM generates a draft answer from the exercise description, the student's code, and the question.
3. A teaching assistant reviews and edits the draft before an answer is sent to the student.
This approach minimizes risk since each AI-generated answer undergoes human review and editing. Moreover, it's not time-sensitive. If the AI draft is inadequate or fails, the situation remains as it is currently. However, the potential time savings could be substantial.
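Concretely, the stand-alone script above could be wrapped into this flow. A minimal sketch, assuming Dodona triggers it when a question is created; `saveDraft` and `notifyTeachingAssistant` are hypothetical placeholders for the real integration points:

```js
// Hypothetical glue code around fetchData/generateAnswer from the script above.
// saveDraft and notifyTeachingAssistant stand in for actual Dodona internals.
async function handleNewQuestion(questionId) {
  const data = await fetchData(questionId);
  const draft = await generateAnswer(data);    // returns the model's draft text
  await saveDraft(questionId, draft);          // stored, never sent directly
  await notifyTeachingAssistant(questionId);   // TA reviews and edits the draft
}
```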
Since this would be our first LLM integration, it would also involve a research component.