CarperAI / Code-Pile

This repository contains all the code for collecting code from GitHub at large scale.
MIT License

Add datasheets for all datasources #47

Open ncoop57 opened 1 year ago

ncoop57 commented 1 year ago

Follow existing work in the data documentation space, such as https://arxiv.org/abs/1803.09010 and https://arxiv.org/abs/2201.07311.

We will base our documentation on the Hugging Face template: https://github.com/huggingface/datasets/blob/main/templates/README.md
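
As a rough starting point, a small script could stub out one datasheet file per data source so that each can be filled in separately. The sketch below is only an illustration: the section names are an abbreviated list that I am assuming matches the top-level headings of the template linked above, the `[More Information Needed]` placeholder is likewise assumed, and `render_datasheet`, `write_datasheets`, the `datasheets` directory, and the example source names are all hypothetical rather than anything already in this repository.

```python
from pathlib import Path

# Assumed, abbreviated list of top-level sections; verify against the real template.
SECTIONS = [
    "Dataset Description",
    "Dataset Structure",
    "Dataset Creation",
    "Considerations for Using the Data",
    "Additional Information",
]

# Assumed placeholder text for sections that have not been written yet.
PLACEHOLDER = "[More Information Needed]"


def render_datasheet(source_name: str) -> str:
    """Return a Markdown skeleton with one placeholder per section."""
    lines = [f"# Datasheet for {source_name}", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", PLACEHOLDER, ""]
    return "\n".join(lines)


def write_datasheets(source_names, out_dir="datasheets"):
    """Write one skeleton file per data source and return the paths created."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name in source_names:
        path = out / f"{name}.md"
        if not path.exists():  # never overwrite a datasheet someone has filled in
            path.write_text(render_datasheet(name), encoding="utf-8")
            written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical data source names, for illustration only.
    for created in write_datasheets(["example_source_a", "example_source_b"]):
        print(f"created {created}")
```

Skipping files that already exist means the script can be re-run whenever a new data source is added without touching datasheets that are already in progress.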