Open ArtemisDicoTiar opened 11 months ago
ToDo:
1 - implement models & algorithm to train parameter-wise models (재석 & 원표)
2 - pseudo-dataset + build n-grams (Romain)
3 - speculative decoding (accepted tokens, inference speed?) (재진)
4 - evaluation module (John)
target task: summarization
distillation: teacher → student (draft model)
t5-xl: target, t5-small: drafter
n-gram: ...?
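For the "accepted tokens" question, here is a rough sketch of one draft-then-verify step in the greedy variant of speculative decoding. Everything in it is an assumption for illustration: `speculative_step`, `draft_next`, and `target_next` are hypothetical names, and the toy functions stand in for t5-small and t5-xl. A real implementation would score all drafted tokens in a single target forward pass and could use the sampling-based acceptance rule instead of exact greedy match.

```python
def speculative_step(prefix, draft_next, target_next, k=4):
    """One greedy draft-then-verify step (illustrative sketch only).

    draft_next / target_next: hypothetical callables mapping a token list
    to the next token. Returns (new_tokens, n_accepted_draft_tokens).
    """
    # 1) The drafter proposes k tokens autoregressively.
    draft = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        draft.append(tok)
        ctx.append(tok)

    # 2) The target checks each drafted token in order.
    new_tokens = []
    ctx = list(prefix)
    for tok in draft:
        expected = target_next(ctx)
        if tok != expected:
            # First mismatch: keep the target's token and stop.
            new_tokens.append(expected)
            return new_tokens, len(new_tokens) - 1
        new_tokens.append(tok)
        ctx.append(tok)

    # 3) All drafted tokens accepted: the target adds one extra token.
    new_tokens.append(target_next(ctx))
    return new_tokens, len(draft)


# Toy stand-ins: the "target" emits len(ctx) % 5, the "drafter" agrees
# except when the context has exactly 3 tokens.
def toy_target(ctx):
    return len(ctx) % 5


def toy_drafter(ctx):
    return 9 if len(ctx) == 3 else len(ctx) % 5


tokens, n_accepted = speculative_step([0], toy_drafter, toy_target, k=4)
```

The second return value is the number of drafted tokens the target accepted in that step, which is the quantity to log when comparing drafters (t5-small vs. n-gram) on acceptance rate and inference speed.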
N-gram: the KD n-gram model should be trained on the model-generated dataset.
N-gram model for a specific domain, or a dataset-specific n-gram model
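A minimal sketch of what "build n-grams from the model-generated dataset" could look like, assuming the pseudo-dataset is available as lists of tokens. The names `build_ngram_counts` and `predict_next` and the two example sequences are made up for illustration; they are not taken from the repo.

```python
from collections import Counter, defaultdict


def build_ngram_counts(sequences, n=2):
    """Count next-token frequencies for every (n-1)-token context."""
    counts = defaultdict(Counter)
    for tokens in sequences:
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            next_token = tokens[i + n - 1]
            counts[context][next_token] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent continuation, or None for an unseen context."""
    options = counts.get(tuple(context))
    if not options:
        return None
    return options.most_common(1)[0][0]


# Hypothetical pseudo-dataset: token sequences generated by the teacher model.
generated = [
    ["the", "model", "writes", "a", "summary"],
    ["the", "model", "writes", "text"],
]
counts = build_ngram_counts(generated, n=2)
next_token = predict_next(counts, ["the"])
```

A domain-specific or dataset-specific n-gram model would use the same counting code and differ only in which generated sequences are passed in.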
Todo