OpenThaiGPT / openthaigpt-pretraining

Apache License 2.0

feat(model): Add Huggingface trainer #311

Closed. boss-chanon closed this 11 months ago

boss-chanon commented 11 months ago

Why this PR

The Hugging Face Trainer is more efficient than the old pipeline. @new5558 approved this PR having more than 200 changed lines.

Changes

Related Issues

Close #310

Checklist

codecov[bot] commented 11 months ago

Codecov Report

All modified lines are covered by tests :white_check_mark:

Comparison is base (361a98f) 19.39% compared to head (957e2b6) 19.39%. Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #311   +/-   ##
=======================================
  Coverage   19.39%   19.39%
=======================================
  Files          25       25
  Lines        1392     1392
=======================================
  Hits          270      270
  Misses       1122     1122
```

| [Flag](https://app.codecov.io/gh/OpenThaiGPT/openthaigpt-pretraining/pull/311/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OpenThaiGPT) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/OpenThaiGPT/openthaigpt-pretraining/pull/311/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OpenThaiGPT) | `19.39% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OpenThaiGPT#carryforward-flags-in-the-pull-request-comment) to find out more.


new5558 commented 11 months ago

I think we only need the DeepSpeed code for now. Please remove the FSDP-related submission code.
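For illustration, a DeepSpeed-only setup for the Hugging Face Trainer could be driven by a small config file like the one generated below. This is a minimal sketch, not the code from this PR: the helper names `build_deepspeed_config` and `write_deepspeed_config`, the file name, and the choice of ZeRO stage 2 are all assumptions. The `"auto"` values rely on the Trainer's DeepSpeed integration filling them in from `TrainingArguments`.

```python
import json
from pathlib import Path


def build_deepspeed_config(zero_stage: int = 2) -> dict:
    """Return a minimal DeepSpeed config dict (hypothetical helper).

    "auto" asks the Hugging Face Trainer integration to copy the value
    from TrainingArguments instead of hard-coding it here.
    """
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError(f"ZeRO stage must be 0-3, got {zero_stage}")
    return {
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
        "gradient_clipping": "auto",
        "bf16": {"enabled": "auto"},
        "zero_optimization": {"stage": zero_stage},
    }


def write_deepspeed_config(path: str, zero_stage: int = 2) -> Path:
    """Write the config as JSON and return the path (hypothetical helper)."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(build_deepspeed_config(zero_stage), indent=2))
    return out
```

The resulting file could then be passed to the Trainer through `TrainingArguments(deepspeed="ds_config.json", ...)`, which keeps the submission code free of any FSDP-specific settings.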

new5558 commented 11 months ago

(two attached images)

Please adhere to the model training requirements.

new5558 commented 11 months ago

Please run through the Model Training Checklist, especially the resume-from-checkpoint section.
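As a rough sketch of what checking resume-from-checkpoint can involve: the Hugging Face Trainer saves checkpoints in `checkpoint-<step>` subdirectories of its output directory, so a launcher can look for the newest one before training starts. The helper `find_last_checkpoint` below is hypothetical and uses only the standard library; `transformers.trainer_utils.get_last_checkpoint` offers similar behavior. It is not taken from this PR or from the checklist.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the Trainer's checkpoint directory naming, e.g. "checkpoint-1500".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint directory with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        # Only directories count; stray files with a matching name are ignored.
        if match and child.is_dir():
            step = int(match.group(1))  # compare numerically, not as strings
            if step > best_step:
                best_step, best_path = step, child
    return str(best_path) if best_path is not None else None
```

The returned path (or `None` for a fresh run) could be passed as `trainer.train(resume_from_checkpoint=...)`, so an interrupted job restarts from its last saved step rather than from scratch.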