zczlsde / AlphaCode_reproduce

AlphaCode_reproduce

Link to the Paper

This reproduction uses an encoder-decoder architecture with 4 encoder blocks and 24 decoder blocks (6 query heads).
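To make these layer counts concrete, here is a minimal sketch of a configuration object. The class name `ModelConfig`, its field names, and the `d_model` value of 768 are assumptions made for illustration; only the encoder, decoder, and query-head counts come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the layer counts stated in this README."""

    n_encoder_layers: int = 4    # from the README
    n_decoder_layers: int = 24   # from the README
    n_query_heads: int = 6       # from the README
    d_model: int = 768           # assumption: not stated in the README

    @property
    def head_dim(self) -> int:
        # Per-head width when the model width is split evenly across query heads.
        if self.d_model % self.n_query_heads != 0:
            raise ValueError("d_model must be divisible by n_query_heads")
        return self.d_model // self.n_query_heads


config = ModelConfig()
print(config.n_encoder_layers, config.n_decoder_layers, config.head_dim)
```

The decoder is six times deeper than the encoder, so most of the parameters sit on the generation side of the model.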

Pre-training

Link to the Dataset.

The dataset used in the original paper is not open source, so this project instead uses another snapshot of GitHub code available through the datasets library. The GitHub Code dataset consists of 115M code files from GitHub in 32 programming languages with 60 extensions, totaling 1 TB of data.
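Because the dataset is far too large to download casually, one option is to stream it and filter by language on the fly. The sketch below is illustrative only: the dataset ID `codeparrot/github-code`, the `language` and `code` field names, and both helper functions are assumptions, not code from this repository.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, Set


def filter_by_language(records: Iterable[Dict], languages: Set[str]) -> Iterator[Dict]:
    """Yield only records whose 'language' field is in `languages`."""
    for record in records:
        if record.get("language") in languages:
            yield record


def stream_github_code(dataset_id: str = "codeparrot/github-code"):
    """Open the dataset in streaming mode (dataset ID is an assumption)."""
    # Imported lazily so the rest of this sketch runs without the library installed.
    from datasets import load_dataset

    return load_dataset(dataset_id, split="train", streaming=True)


# Offline demonstration with hand-written records in the assumed schema.
sample = [
    {"language": "Python", "code": "print('hi')"},
    {"language": "Java", "code": "class A {}"},
    {"language": "C++", "code": "int main() {}"},
]
first_two = list(islice(filter_by_language(sample, {"Python", "C++"}), 2))
print([record["language"] for record in first_two])
```

With network access and the datasets library installed, the same filter could be applied to the live stream by passing `stream_github_code()` in place of `sample`.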