OpenThaiGPT/openthaigpt-pretraining
Apache License 2.0
feat(model): load hf dataset from local in spm_trainer
#189
Closed
boss-chanon
closed
1 year ago
boss-chanon
commented
1 year ago
Why this PR
Allow the SPM trainer to load a Hugging Face (HF) dataset stored locally.
Changes
The SPM trainer now loads the dataset in HF format when data_type is None.
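The branching described above might look roughly like the following. This is only a sketch, not the code from this PR: the helper names `resolve_loader` and `load_training_dataset`, the `dataset_path` parameter, and the `"train"` split are assumptions made for illustration. It also assumes that "HF format" means a directory written by `save_to_disk()`, which is read back with `datasets.load_from_disk`.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_loader(
    dataset_path: str, data_type: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Pick the `datasets` function name and its keyword arguments.

    Hypothetical helper: the real spm_trainer may structure this differently.
    """
    if data_type is None:
        # No explicit file type: treat the path as a dataset directory
        # previously written with Dataset.save_to_disk().
        return "load_from_disk", {"dataset_path": dataset_path}
    # Otherwise use a file-based builder such as "json", "csv", or "text".
    return "load_dataset", {
        "path": data_type,
        "data_files": dataset_path,
        "split": "train",  # assumed split name
    }


def load_training_dataset(dataset_path: str, data_type: Optional[str] = None):
    """Load the dataset with the function chosen by resolve_loader()."""
    # Imported lazily so resolve_loader() can be used without `datasets` installed.
    import datasets  # Hugging Face `datasets` package

    func_name, kwargs = resolve_loader(dataset_path, data_type)
    return getattr(datasets, func_name)(**kwargs)
```

With this shape, `load_training_dataset("data/corpus")` would read a locally saved HF dataset, while `load_training_dataset("data/corpus.jsonl", "json")` would fall back to the file-based loader.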
Related Issues
Close #
Checklist
[ ] PR should follow the Naming convention
[ ] Assign yourself to Assignees
[ ] Tag related issues
[ ] Constant names should be ALL_CAPITAL, function names should be snake_case, and class names should be CamelCase
[ ] Complex functions/algorithms should have a Docstring
[ ] A single PR should not change more than 200 lines (test files are excepted). If it would, please open multiple PRs
[ ] At least one PR reviewer must come from the task's team (model, eval, data)