EleutherAI / project-menu

See the issue board for the current status of active and prospective projects!

[Project] GPT-NeoX: an open-source framework for training language models with billions of parameters #12

Closed · StellaAthena closed this issue 1 year ago

StellaAthena commented 3 years ago

Project: GPT-NeoX

Codebase and Materials: codebase, training data

Project Lead(s): Sid Black (@sdtblck)

Currently Active Members: Alex Andonian (@alexandonian), Stella Biderman (@StellaAthena), Sid Black (@sdtblck), Preetham Gali (@preethamgali), Shivanshu Purohit (@ShivanshuPurohit)

Elevator Pitch: Massive language models like GPT-3 are incredibly powerful tools for research and industry alike. As they tend to be very expensive to develop, the groups that own them are very hesitant to share them with the public. Our goal is to train a suite of massive language models ranging in size from 1B to 200B parameters and make the pretrained models freely available for anyone to use.

Goal Outputs:

  1. An open-source codebase capable of training, evaluating, and distilling GPT-3-style language models as large as 200B parameters (see the distillation sketch after this list).
  2. Pretrained model checkpoints of a variety of sizes all the way up to 200B parameters.
  3. Several academic papers based on the results of our work, including "Lessons Learned Training a 200B Parameter Language Model on Commodity Hardware" and "Scaling Laws for Distilling Language Models."
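
For readers unfamiliar with the distillation objective behind goals 1 and 3, here is a minimal sketch of standard knowledge distillation (a soft-target KL term blended with ordinary next-token cross-entropy) in plain PyTorch. The temperature, weighting, and toy shapes are illustrative assumptions drawn from the general technique, not GPT-NeoX's actual implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Soft-target KL divergence (teacher -> student) blended with
    hard-label cross-entropy. temperature and alpha are illustrative
    hyperparameters, not values used by GPT-NeoX."""
    # Soft targets: match the student's distribution to the teacher's at a
    # raised temperature, scaled by T^2 as in standard knowledge distillation.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary next-token cross-entropy against the data.
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
    )
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage: batch of 2 sequences, 8 tokens each, GPT-2-sized vocabulary.
student = torch.randn(2, 8, 50257)
teacher = torch.randn(2, 8, 50257)
labels = torch.randint(0, 50257, (2, 8))
print(distillation_loss(student, teacher, labels))
```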

Milestones:

Current Status:

  - Preetham and Stella are implementing distillation functionality.
  - Alex and Sid are working on various minor fixes and adding new features.
  - Shivanshu is working on the eval harness integration.
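
For context on the eval harness integration, here is a minimal sketch of how EleutherAI's lm-evaluation-harness is typically driven from Python, assuming its simple_evaluate entry point and the Hugging Face model backend. The backend name, model, tasks, and exact argument names are illustrative and vary between harness versions; this is not the project's integration code.

```python
# Minimal sketch, assuming EleutherAI's lm-evaluation-harness is installed
# (pip install lm-eval). Backend, task, and argument names vary between
# harness versions; the model and tasks below are purely illustrative.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/gpt-neo-125m",
    tasks=["lambada_openai", "hellaswag"],
    num_fewshot=0,
)
print(results["results"])                            # per-task metric table
```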

How to Help: Check out the open issues.

Desired Support: We always need more GPUs.