minzero31 / MadCamp-2week-frontend

madcamp week2 assignment frontend repository
MIT License

[Error] Fix Gemini prompt #23

Open madcampnewbie opened 1 week ago

madcampnewbie commented 1 week ago

prompt = (
    f'Create seven exam questions based on the following text: "{ocr_text}". '
    'All questions and answer choices must be in Korean. '
    'Format each question as "#####. [질문]?". '
    # The original was missing ". " here, so two instructions ran together.
    'Generate multiple-choice questions with five answer choices. '
    'Format each choice as "$$$$$. [선택지 내용]". '
    'At the end, provide the correct answer numbers (1 to 5) for each question '
    'as a string of 7 digits concatenated together.'
)

        questions_list = []
        options_list = []
        answers_list = []
        cnt = 0
        options = []

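        # `questions` holds the lines of the model response, one item per line.
        for sentence in questions:
            if "#" in sentence:
                # Question line: strip the markers, Markdown bold, and periods.
                questions_list.append(sentence.replace('*', '').replace('#', '').replace('.', ''))
            elif "$" in sentence:
                # Option line: collect five options, then close the group.
                cnt += 1
                if cnt < 5:
                    options.append(sentence.replace('$', '').replace('.', ''))
                else:
                    options.append(sentence.replace('$', ''))
                    options_list.append(options)
                    cnt = 0
                    options = []
            elif sentence != '':
                # Any other non-empty line may carry the answer digits.
                # extract_numbers is a helper defined elsewhere (not shown here).
                ex_sentence = extract_numbers(sentence)
                if ex_sentence != '':
                    answers_list.append(ex_sentence)
        # Split the first digit string into one int per question;
        # fall back to an empty list when the model returned no answer line.
        answers_list = [int(char) for char in answers_list[0]] if answers_list else []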

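When the prompt defines a separate format for questions, choices, and answers, repeating the marker symbol several times (for example "#####" or "$$$$$") makes the generative model more likely to treat the format as important, so the output is more likely to come back in the expected form.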
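As a rough sketch of how the repeated-marker format could be parsed more strictly than with a substring check, the following assumes the model response arrives as a single string with one item per line. The function name `parse_quiz`, the `n_options` parameter, and the rule that the answer line must contain exactly one digit per question are all assumptions for illustration, not the project's actual code.

```python
import re

# Markers taken from the prompt: "#####. question" and "$$$$$. option".
QUESTION_RE = re.compile(r"^#{5}\.?\s*(.+)$")
OPTION_RE = re.compile(r"^\${5}\.?\s*(.+)$")


def parse_quiz(response_text, n_options=5):
    """Split a model response into questions, option groups, and answers."""
    questions, options, answers = [], [], []
    current = []  # options collected for the question being read
    for raw in response_text.splitlines():
        # Drop Markdown bold markers the model sometimes adds.
        line = raw.replace("*", "").strip()
        if not line:
            continue
        q_match = QUESTION_RE.match(line)
        o_match = OPTION_RE.match(line)
        if q_match:
            questions.append(q_match.group(1))
        elif o_match:
            current.append(o_match.group(1))
            if len(current) == n_options:
                options.append(current)
                current = []
        elif not answers:
            # Assumed rule: the answer line has one digit (1-5) per question.
            digits = re.sub(r"\D", "", line)
            if len(digits) == len(questions) and re.fullmatch(r"[1-5]+", digits):
                answers = [int(ch) for ch in digits]
    return questions, options, answers
```

Anchoring the patterns at the start of the line means a stray "#" or "$" inside a question or option no longer changes how the line is classified, and periods inside the content are preserved.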

madcampnewbie commented 1 week ago

When the OCR result contained symbols other than Hangul characters, the generated answers often deviated from the requested format.
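One possible mitigation, sketched below, is to clean the OCR text before it is interpolated into the prompt. The helper `sanitize_ocr_text` and the set of characters it keeps are assumptions chosen for illustration; whether this actually reduces the format errors has not been verified here.

```python
import re


def sanitize_ocr_text(ocr_text):
    """Remove symbols from OCR output that could collide with the prompt format."""
    # Drop the characters reserved as markers (#, $, *) and the double
    # quote that delimits the OCR text inside the prompt.
    cleaned = re.sub(r'[#$*"]', " ", ocr_text)
    # Keep Hangul, ASCII letters and digits, whitespace, and basic
    # punctuation; replace every other symbol with a space.
    cleaned = re.sub(
        r"[^\uAC00-\uD7A3\u3131-\u318E0-9A-Za-z\s.,?!()%:;~-]", " ", cleaned
    )
    # Collapse the runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", cleaned).strip()
```

The cleaned string would then be passed in place of `ocr_text` when building the prompt.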