karpathy / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
MIT License
20.3k stars 2.53k forks

Data preprocessing #142

Closed: umertens closed this 2 months ago

umertens commented 2 months ago

  1. Split data preprocessing from training
  2. Move data preprocessing and embedding generation into their own classes
  3. Simplify file storage to enable training on larger datasets

letmestudy commented 2 months ago

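This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.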

umertens commented 2 months ago

Apologies, this PR was opened by mistake. Closing now!