OpenThaiGPT / openthaigpt-pretraining

Apache License 2.0
21 stars 10 forks

[SIIT] Explore pre-training configurations #258

Open Wintoplay opened 1 year ago

Wintoplay commented 1 year ago

Explore pre-training configurations and the related literature.

Examples:

- Optimizer, dataset sampling, and tokenizer vocab size
- Tokenizer type: SPM vs BPE
- Clean-data threshold
- Learning rate (MUP)
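
To make the scope of the exploration concrete, here is a minimal sketch of how the options above could be laid out as a search grid. Every name and value in it (`SEARCH_SPACE`, `enumerate_configs`, the optimizer names, vocab sizes, thresholds, and learning rates) is a hypothetical placeholder chosen for illustration, not a setting taken from this issue or from the repository.

```python
from itertools import product

# Hypothetical search space: all keys and values are illustrative
# placeholders, not settings decided in this issue.
SEARCH_SPACE = {
    "optimizer": ["adamw", "lion"],
    "dataset_sampling": ["uniform", "temperature"],
    "tokenizer_type": ["spm", "bpe"],
    "vocab_size": [32_000, 50_000],
    "clean_data_threshold": [0.5, 0.8],
    "learning_rate": [1e-4, 3e-4],
}


def enumerate_configs(space):
    """Return one dict per combination of the options in `space`."""
    keys = list(space)
    return [
        dict(zip(keys, values))
        for values in product(*(space[key] for key in keys))
    ]


configs = enumerate_configs(SEARCH_SPACE)
print(f"{len(configs)} candidate configurations")
print(configs[0])
```

A full grid grows quickly with each added option, so in practice a subset (for example, varying one factor at a time against a baseline) may be more realistic than running every combination.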

Evaluate with a real evaluation set.

Wintoplay commented 1 year ago

plan.md