hpcaitech / ColossalAI-Examples

Examples of training models with hybrid parallelism using ColossalAI
Apache License 2.0

T5 example #41

Closed · SaraBakic closed this pull request 2 years ago

SaraBakic commented 2 years ago

This PR adds a pipeline for training T5 with a language-modeling objective, following the GPT pipeline from the examples.
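
For reference, a minimal sketch of the training objective being described, using the Hugging Face T5 classes; the PR's actual pipeline builds on the ColossalAI GPT example and is more involved, so the code below is illustrative only.

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

# Randomly initialised model with the default (t5-small-sized) config.
model = T5ForConditionalGeneration(T5Config())

input_ids = torch.randint(0, 32000, (2, 16))  # dummy source token ids
labels = torch.randint(0, 32000, (2, 16))     # dummy target token ids

# Unlike GPT, T5 is an encoder-decoder model, so inputs and targets are
# separate tensors; passing labels makes the forward pass return the LM loss.
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
```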

FrankLeeeee commented 2 years ago

Hi, I noticed that this PR contains a lot of files for data pre-processing that already exist in the GPT folder. Can you remove these files and state in your README.md that the user should create a symbolic link to the GPT versions instead?
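
For instance, the README could tell users to link the shared pre-processing tools rather than duplicating them. A minimal sketch, assuming the T5 and GPT examples sit in sibling directories (the actual layout may differ):

```python
# Equivalent to running `ln -s ../gpt/tools tools` from inside the T5
# example directory; the relative path is an assumption about the layout.
import os

if not os.path.exists("tools"):
    os.symlink("../gpt/tools", "tools")
```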

SaraBakic commented 2 years ago

Hi, yes, the preprocessing structure is the same, but it is adapted for T5 (everything related to tokenization targets T5 rather than GPT). Some of the files are identical to the ones in the GPT folder (e.g. in tools/download). Should I remove only the folders whose files all already exist in the GPT folder, leaving those that contain something new, or remove every file that is identical to its GPT counterpart?
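
To illustrate the tokenization difference: GPT-2 ships a byte-level BPE tokenizer while T5 uses SentencePiece, so the pre-processing cannot be shared wholesale. A small sketch using the standard Hugging Face checkpoints (not necessarily the ones the PR uses):

```python
from transformers import GPT2Tokenizer, T5Tokenizer

gpt_tok = GPT2Tokenizer.from_pretrained("gpt2")   # byte-level BPE
t5_tok = T5Tokenizer.from_pretrained("t5-small")  # SentencePiece (needs the sentencepiece package)

text = "Hello world"
print(gpt_tok.tokenize(text))  # e.g. ['Hello', 'Ġworld']
print(t5_tok.tokenize(text))   # e.g. ['▁Hello', '▁world']
```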

feifeibear commented 2 years ago

Hey, I noticed your great job on the T5 example. You can just reuse the GPT code as much as possible.

yuxuan-lou commented 2 years ago

Also, could you provide training logs for the different parallelism strategies?
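
For context, the examples in this repo switch strategies through a ColossalAI config file; a sketch of what such a config looked like in the 2022-era API (the values here are hypothetical, not from any reported run):

```python
# config.py -- hypothetical values for an 8-GPU run.
BATCH_SIZE = 8
NUM_MICRO_BATCHES = 4

parallel = dict(
    pipeline=2,                      # 2 pipeline stages
    tensor=dict(size=2, mode='1d'),  # 2-way 1D tensor parallelism
)
# ColossalAI derives the data-parallel size automatically:
# 8 GPUs / (2 pipeline * 2 tensor) = 2-way data parallelism.
```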

binmakeswell commented 2 years ago

This PR is closed due to inactivity.