facebookresearch / metaseq

Repo for external large-scale work
MIT License

Question about continuing training from a checkpoint #757

Open robinzixuan opened 2 months ago

robinzixuan commented 2 months ago

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

  1. Is it possible to share the intermediate checkpoints of the OPT-1.3b model from around training steps 10K to 20K?
  2. Also, if we want to continue training from those checkpoints or from the official checkpoint, is it possible to get the full training dataset that Meta used for the OPT models?

    Code

#### What have you tried?

#### What's your environment?

- metaseq Version (e.g., 1.0 or master):
- PyTorch Version (e.g., 1.0)
- OS (e.g., Linux):
- How you installed metaseq (`pip`, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information: