In this project, we will attempt to recreate GitHub Copilot. Since properly recreating the service requires an immense amount of compute for model training, we will seek to build a Mini Copilot.
This repository contains both educational materials (starter notebooks for basic training) and our actual implementation of recreating GitHub Copilot. Educational materials can be found under `education/`, and our implementation can be found under `src/`.
Accompanying educational slide decks can be found here (requires UMich login).
Recreating GitHub Copilot involves three parts: the underlying code completion model, the endpoint serving the model, and the actual application (a VSCode extension) using the model.

- Model: `gpt-2` on a code completion task (code found under `src/model`).
- Endpoint: the Lambda endpoint serving the model (found under `src/backend`).
- Extension: a VSCode extension using the `registerInlineCompletionItemProvider` VSCode API (full implementation found under `src/extension`). This extension calls our Lambda endpoint to supply the code completion.

Week | Date | Topic |
---|---|---|
1 | 9/22 | Project Overview + Causal Language Modeling (CLM) w/ n-grams |
2 | 9/29 | CLM + High Performance Computing (HPC) |
3 | 10/6 | CLM Continued + Model Evaluation |
4 | 10/20 | Masked Language Modeling (MLM) |
5 | 10/27 | Model Deployment |
6 | 11/3 | Creating a VSCode Extension |
7 | 11/10 | Buffer Week / Going deeper |
8 | 11/17 | Final Expo Prep |
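
To make the endpoint part of the architecture more concrete, here is a minimal sketch of what a Lambda handler serving completions could look like. It is not the code in `src/backend`. The request and response fields (`prompt`, `completion`) and the `complete_code` helper are hypothetical, and the event shape assumes an API Gateway style proxy request with a JSON string in `body`. The model call is stubbed out so the example runs without any dependencies.

```python
import json


def complete_code(prompt: str) -> str:
    """Hypothetical stand-in for the model call (assumption, not the real model)."""
    # A real deployment would run the trained model here; this stub just
    # suggests a placeholder body after a line ending in a colon.
    return " pass" if prompt.rstrip().endswith(":") else ""


def lambda_handler(event, context):
    """Parse a JSON request body and return a JSON completion response."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    # Reject payloads that are not objects or lack a non-empty string prompt.
    prompt = body.get("prompt") if isinstance(body, dict) else None
    if not isinstance(prompt, str) or not prompt:
        return {"statusCode": 400, "body": json.dumps({"error": "missing 'prompt'"})}

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"completion": complete_code(prompt)}),
    }
```

Under these assumptions, the extension would send the text before the cursor as `prompt` and render the returned `completion` as an inline suggestion.
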
Project Members:
Project Leads: