ArtemisDicoTiar / FastLLM


Meeting Note [11/23] #6

Open ArtemisDicoTiar opened 7 months ago

ArtemisDicoTiar commented 7 months ago

target task: summarization

distillation: teacher → student (draft model)

t5-xl: target

t5-small: drafter

n-gram: ...?

N-gram: the KD n-gram model should be trained on the model-generated dataset.

N-gram scope: a domain-specific n-gram model, or a dataset-specific n-gram model.
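A minimal sketch of the n-gram idea above, assuming the model-generated dataset is already available as lists of token ids. The names `build_ngram_table`, `draft_next`, and `pseudo_dataset` are hypothetical and the token ids are made up for illustration; this is not the project's implementation.

```python
from collections import Counter, defaultdict


def build_ngram_table(sequences, n=2):
    """Count which token follows each (n-1)-token context."""
    table = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            context = tuple(seq[i:i + n - 1])
            next_token = seq[i + n - 1]
            table[context][next_token] += 1
    return table


def draft_next(table, context):
    """Return the most frequent continuation, or None for an unseen context."""
    counts = table.get(tuple(context))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical stand-in for summaries generated by the target model.
pseudo_dataset = [
    [5, 8, 2, 9],
    [5, 8, 2, 7],
    [5, 8, 2, 9],
]

table = build_ngram_table(pseudo_dataset, n=3)
print(draft_next(table, [8, 2]))
```

Restricting `pseudo_dataset` to one domain or one dataset would give the domain-specific or dataset-specific variant with no change to the counting code.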

Todo

romsto commented 7 months ago

ToDo:

1. Implement models and the algorithm to train parameter-wise models: 재석 & 원표
2. Pseudo-dataset + build n-grams: Romain
3. Speculative decoding (accepted tokens, inference speed?): 재진
4. Evaluation module: John
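For the "accepted tokens" metric in the speculative decoding item, a rough sketch of how it could be counted, assuming greedy verification: a drafted token is accepted only while it matches the target model's own choice at that position. This is a simplification (sampling-based speculative decoding uses a probabilistic accept/reject rule instead), and `count_accepted`, `acceptance_rate`, and the token ids are hypothetical.

```python
def count_accepted(draft_tokens, target_tokens):
    """Length of the drafted prefix that agrees with the target's tokens."""
    accepted = 0
    for drafted, verified in zip(draft_tokens, target_tokens):
        if drafted != verified:
            break  # first mismatch ends acceptance for this step
        accepted += 1
    return accepted


def acceptance_rate(steps):
    """Fraction of drafted tokens accepted over all (draft, target) steps."""
    proposed = sum(len(draft) for draft, _ in steps)
    accepted = sum(count_accepted(draft, target) for draft, target in steps)
    return accepted / proposed if proposed else 0.0


# Each pair: tokens proposed by the drafter, tokens the target picks
# at the same positions (made-up ids).
steps = [
    ([4, 7, 7, 1], [4, 7, 3, 1]),
    ([9, 2], [9, 2]),
]

print(count_accepted(*steps[0]))
print(acceptance_rate(steps))
```

Inference speed would need to be measured separately (wall-clock time per generated token), since a high acceptance rate does not by itself account for the drafter's own cost.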