Closed Amber-Chaeeunk closed 2 years ago
## Underline Embedding

Underline embedding is added at tokenizing time (not at data collating time), with `pad_to_max_length: True`.
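A minimal sketch of what "at tokenizing time" implies: one underline id per token is built from the tokenizer's character offsets and padded to the fixed length right away, so the data collator never has to touch it. The helper name and span format below are assumptions, not the repo's actual code.

```python
# Hypothetical helper: build a per-token underline mask at tokenize time.
# `underline_spans` marks character ranges in the context that should
# receive underline embedding id 1; everything else (including padding
# and special tokens, whose offsets are (0, 0)) gets id 0.
from typing import List, Tuple

def build_underline_ids(offsets: List[Tuple[int, int]],
                        underline_spans: List[Tuple[int, int]],
                        max_length: int) -> List[int]:
    ids = []
    for start, end in offsets:
        overlaps = any(s < end and start < e for s, e in underline_spans)
        ids.append(1 if overlaps and end > start else 0)
    # pad_to_max_length: True -> pad here, not in the collator
    ids += [0] * (max_length - len(ids))
    return ids[:max_length]

# token offsets, e.g. from tokenizer(..., return_offsets_mapping=True)
offsets = [(0, 0), (0, 5), (6, 11), (12, 17), (0, 0)]
print(build_underline_ids(offsets, underline_spans=[(6, 17)], max_length=8))
```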
List of models that can use underline embedding:

- [x] roberta - extractive model
  - settings in `ModelArguments`
    - `model_name_or_path`: `klue/roberta-large`
    - `reader_type`: `extractive`
    - `underline`: `True`
    - `architectures`: `RobertaForQAWithUnderline`
- [ ] bart - generative model (focusing on the extractive model)
  - settings in `ModelArguments`
    - `model_name_or_path`: `hyunwoongko/kobart`
    - `reader_type`: `generative`
    - `underline`: `True`
    - `architectures`: `BartForCGWithUnderline`
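A dependency-free sketch of the idea shared by `RobertaForQAWithUnderline` and `BartForCGWithUnderline`: before the encoder, each token embedding gets a learned underline vector added, selected by the token's underline id (0 = plain, 1 = underlined). All names below are illustrative assumptions, not the repo's actual classes.

```python
# Plain-Python stand-in for an underline embedding layer (in the real
# models this would be an nn.Embedding(2, hidden_size) whose output is
# summed with the word embeddings).
def add_underline_embedding(token_embeds, underline_ids, underline_table):
    """token_embeds: list of (hidden,) vectors, one per token.
    underline_table: 2 x hidden lookup (row 0 = plain, row 1 = underlined).
    """
    return [
        [t + u for t, u in zip(vec, underline_table[uid])]
        for vec, uid in zip(token_embeds, underline_ids)
    ]

# two tokens, hidden size 3; the second token is underlined
tokens = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3]]
table = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
print(add_underline_embedding(tokens, [0, 1], table))
```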
## Punctuation

Encoder model: `kiyoung2/roberta-large-qaconv-sds`

The Top-k contexts fetched by Retrieval are merged into a single context, then punctuation is added to the Top-k sentences with the highest similarity to the question.

- `punctuation`: `True`
- `top_k_punctuation`: `n` (default: n=5)
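The step above can be sketched as follows. In the real pipeline the `kiyoung2/roberta-large-qaconv-sds` encoder scores question-sentence similarity; here plain word overlap stands in so the example stays self-contained, and the function name and `mark` token are assumptions.

```python
# Hypothetical sketch: merge retrieved contexts, then surround the top_k
# sentences most similar to the question with a punctuation marker.
def add_punctuation(contexts, question, top_k=5, mark="**"):
    merged = " ".join(contexts)
    sentences = [s.strip() for s in merged.split(".") if s.strip()]
    q_words = set(question.lower().split())

    def score(sent):
        # stand-in similarity: shared-word count (the real pipeline
        # would use encoder similarity scores instead)
        return len(q_words & set(sent.lower().split()))

    top = set(sorted(sentences, key=score, reverse=True)[:top_k])
    return ". ".join(f"{mark}{s}{mark}" if s in top else s
                     for s in sentences) + "."

ctxs = ["The cat sat on the mat. Dogs bark loudly.",
        "A cat chases the red ball."]
print(add_punctuation(ctxs, "Where did the cat sit?", top_k=1))
```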