Quantized LoRA tuning of the polyglot-ko model

GirinMan / HYU-Graduation-Project-Quantization

Repository for a graduation project in the Department of Computer Science (컴퓨터소프트웨어학부) at Hanyang University.

Apache License 2.0


Quantized LoRA tuning of the polyglot-ko model #15

Open GirinMan opened 1 year ago

GirinMan commented 1 year ago

Overview

Memory- and parameter-efficient fine-tuning using LLM.int8() + LoRA
The model will be trained with BitsAndBytes + Peft
The backbone is polyglot-ko-5.8b (KoGPT has issues related to special tokens...)
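The setup described above could look roughly like the following sketch. It assumes recent releases of transformers, peft, and bitsandbytes; the LoRA hyperparameters are hypothetical, since the issue does not state them, and the repository's actual training code may differ. Older peft releases from around the time of this issue exposed prepare_model_for_int8_training instead of prepare_model_for_kbit_training.

```python
MODEL_NAME = "EleutherAI/polyglot-ko-5.8b"  # backbone named in this issue

# Hypothetical LoRA hyperparameters; the issue does not state the values used.
LORA_KWARGS = dict(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    # polyglot-ko is a GPT-NeoX model, whose fused attention projection
    # is named "query_key_value".
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)


def load_int8_lora_model(model_name=MODEL_NAME):
    """Load the backbone with 8-bit weights and wrap it with LoRA adapters."""
    # Imported lazily so this module can be read without the GPU stack installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # LLM.int8()
        device_map="auto",
    )
    # Prepares the quantized base model for training (e.g. freezes base weights).
    model = prepare_model_for_kbit_training(model)
    # Only the low-rank adapter weights remain trainable.
    model = get_peft_model(model, LoraConfig(**LORA_KWARGS))
    return model, tokenizer
```

Calling model.print_trainable_parameters() on the returned model reports how small the trainable fraction is compared with the frozen 8-bit backbone.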

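Training approach

Template-based learning is used so that an appropriate prompt can be implemented easily for each task
The dataset and the prompt prefix and suffix are selected when training is launched
Since this is a Korean model, Korean datasets such as NSMC will be used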
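As an illustration of the template-based approach, the sketch below wraps an input text in a prefix and suffix and maps the class label to a target word. The function names, the Korean prefix and suffix, and the label words are all hypothetical; the repository's actual templates are defined in its config files.

```python
# Hypothetical label words for NSMC (0 = negative, 1 = positive).
LABEL_WORDS = {0: "부정", 1: "긍정"}


def build_prompt(text, prefix, suffix):
    """Wrap the raw input text in a task-specific prefix and suffix."""
    return f"{prefix}{text.strip()}{suffix}"


def build_example(text, label, prefix, suffix, label_words=LABEL_WORDS):
    """Turn a (text, label) pair into a prompt and the word the model should generate."""
    return {
        "prompt": build_prompt(text, prefix, suffix),
        "target": label_words[label],
    }


# Hypothetical sentiment template: "Review: <text>\nSentiment: "
example = build_example(" 재미있는 영화였다 ", 1, prefix="리뷰: ", suffix="\n감정: ")
print(example["prompt"] + example["target"])
```

Switching to another task then only requires a different prefix, suffix, and label-word mapping, which is what makes the prompt selectable at launch time.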

GirinMan commented 1 year ago

Updates from pull requests #17 and #18

Built a template-based dataset suited to the generation task
Finished the training code for classification tasks and started testing it (using the polyglot-ko-1.3b model)
Training used a single RTX 3090 24GB
Datasets uploaded to Hugging Face can be loaded through the datasets library (nsmc, klue/ynat). See the config file and the run shell script
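Loading the two datasets mentioned above might look like the sketch below. The helper names are hypothetical, and the code assumes that "klue/ynat" means the ynat configuration of the klue dataset; it is not a general parser for Hugging Face hub IDs. Depending on the installed datasets version, the hub ID or loading options for nsmc may differ.

```python
def split_dataset_name(name):
    """Split 'klue/ynat' into ('klue', 'ynat') and 'nsmc' into ('nsmc', None).

    Assumption: a slash separates dataset path and configuration name,
    which holds for the two names used in this issue.
    """
    path, _, config = name.partition("/")
    return path, (config or None)


def load_task_dataset(name):
    """Load a dataset from the Hugging Face hub by its short name."""
    # Imported lazily so the helper above works without the library installed.
    from datasets import load_dataset

    path, config = split_dataset_name(name)
    return load_dataset(path, config)
```

For example, load_task_dataset("klue/ynat") would return a DatasetDict with the splits provided by the hub for that configuration.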

NSMC

Reached 90% accuracy on nsmc after just 1 epoch; best epoch by accuracy: 4 (eval accuracy 91.156%, f1 0.911) wandb report
Training log 2023-03-21_03-46-37_486145_train.log

KLUE YNAT

Evaluated with macro f1 and accuracy. Best epoch by f1: 5 (eval accuracy 84.8%, f1 0.8483) wandb report
Training log 2023-03-21_10-02-03_065335_train.log
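For reference, the two metrics used above can be computed as in this minimal pure-Python sketch. It is an illustration of the definitions, not the repository's evaluation code, and the toy labels are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for a 3-class problem (made up for illustration).
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because macro f1 weights every class equally, it penalizes poor performance on rare classes more than accuracy does, which matters for a multi-class task such as YNAT topic classification.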