ArtemisDicoTiar / FastLLM


Meeting Note [11/23] #6

Open ArtemisDicoTiar opened 11 months ago

ArtemisDicoTiar commented 11 months ago

target task: summarization

distillation: teacher → student (draft model)
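The note does not say which distillation objective is intended. As a minimal sketch, assuming a standard token-level objective (KL divergence between temperature-softened teacher and student distributions), the loss for one decoding position could look like the following. The function names `softmax` and `kd_loss` and the default temperature are illustrative assumptions, not part of the FastLLM code.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) for a single position (hypothetical helper).

    Both distributions are softened with the same temperature, which is
    an assumed hyperparameter rather than a value from the meeting.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target model)
    q = softmax(student_logits, temperature)  # student (draft model)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In practice it would be averaged over positions and batches.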

t5-xl: target; t5-small: drafter; n-gram: ...?

N-gram: for KD, the n-gram model should be trained on the model-generated dataset.

The n-gram model could be domain-specific or dataset-specific.
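A rough sketch of what building an n-gram drafter from a model-generated dataset might look like, assuming the pseudo-dataset is simply a list of token sequences produced by the target model. `train_ngram` and `draft_next` are hypothetical names, and greedy (most frequent) prediction is an assumption.

```python
from collections import Counter, defaultdict

def train_ngram(sequences, n=2):
    """Count next-token frequencies for each (n-1)-token context."""
    counts = defaultdict(Counter)
    for tokens in sequences:
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            counts[context][tokens[i + n - 1]] += 1
    return counts

def draft_next(counts, context, n=2):
    """Return the most frequent continuation, or None for an unseen context."""
    key = tuple(context[-(n - 1):]) if n > 1 else ()
    options = counts.get(key)
    if not options:
        return None
    return options.most_common(1)[0][0]

# Toy stand-in for summaries generated by the target model.
pseudo_dataset = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["the", "cat", "sat"],
]
bigram = train_ngram(pseudo_dataset, n=2)
```

Training on a corpus from one domain or one dataset gives the specialized variant mentioned above. Only the contents of `pseudo_dataset` change.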

Todo

N-grams: these need a pseudo-dataset
LSTM
CNN
t5-small

romsto commented 11 months ago

ToDo:
1 - implement models & algorithm to train parameter-wise models: 재석 & 원표
2 - pseudo-dataset + build n-grams: Romain
3 - speculative decoding (accepted tokens, inference speed?): 재진
4 - evaluation module: John
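For the speculative decoding item, a minimal sketch of one draft-and-verify step with greedy verification, which also reports the number of accepted tokens. This assumes both models are exposed as plain functions from a token list to the next token. `speculative_step`, `drafter`, `target`, and `k` are hypothetical names, and a real implementation would verify all drafted tokens in a single forward pass of the target model rather than one call per token.

```python
def speculative_step(prefix, drafter, target, k=4):
    """Draft up to k tokens, then keep the longest prefix the target agrees with.

    Returns (new_sequence, accepted), where accepted counts the drafted
    tokens that matched the target's own greedy choice.
    """
    # 1) The cheap drafter proposes up to k tokens autoregressively.
    drafted = []
    context = list(prefix)
    for _ in range(k):
        token = drafter(context)
        if token is None:  # drafter has no proposal for this context
            break
        drafted.append(token)
        context.append(token)

    # 2) The target checks each proposal (sequentially here, for clarity).
    output = list(prefix)
    accepted = 0
    for token in drafted:
        expected = target(output)
        if token != expected:
            output.append(expected)  # replace the first mismatch, then stop
            return output, accepted
        output.append(token)
        accepted += 1

    # 3) Every proposal was accepted: the target contributes one extra token.
    output.append(target(output))
    return output, accepted

# Toy models: the target cycles 0, 1, 2; the drafter is right for two steps.
toy_target = lambda ctx: len(ctx) % 3
toy_drafter = lambda ctx: len(ctx) % 3 if len(ctx) < 2 else 0
sequence, accepted = speculative_step([], toy_drafter, toy_target, k=4)
```

Averaging `accepted` over many steps gives an acceptance statistic. Inference speed would have to be measured separately against plain decoding with the target model.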