CarperAI / Code-Pile

This repository contains all the code for collecting large amounts of code from GitHub.
MIT License

Github issue scraper #49

Open vanga opened 1 year ago

vanga commented 1 year ago

This PR contains BigQuery queries and GitHub GraphQL API scraping code for pre-2015 issues and comments. The code is not easily reusable and has no pluggable configuration. It works, and it serves as a reference for the work done to prepare the GitHub issues dataset.

Further work is needed to clean up the code and turn it into well-abstracted APIs. Feel free to close the PR if something like this shouldn't exist in the main repo.
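
For readers unfamiliar with the GraphQL side of this, here is a minimal sketch of what paging through a repository's issues and comments might look like. It is not the code in this PR: the names `ISSUES_QUERY`, `build_payload`, `is_pre_2015`, and `fetch_page` are hypothetical, the page size of 50 is arbitrary, and the pre-2015 cutoff is applied client-side as an assumption about how the filtering could be done.

```python
import json
import urllib.request

# Public GitHub GraphQL endpoint.
GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Hypothetical query: issues in creation order, each with its first page of comments.
ISSUES_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    issues(first: 50, after: $cursor, orderBy: {field: CREATED_AT, direction: ASC}) {
      pageInfo { hasNextPage endCursor }
      nodes {
        number
        title
        body
        createdAt
        comments(first: 50) {
          nodes { body createdAt author { login } }
        }
      }
    }
  }
}
"""


def build_payload(owner, name, cursor=None):
    """Build the JSON body for one page of issues (cursor=None means first page)."""
    return {
        "query": ISSUES_QUERY,
        "variables": {"owner": owner, "name": name, "cursor": cursor},
    }


def is_pre_2015(issue):
    """Assumed cutoff: keep issues created before 2015-01-01 UTC.

    ISO 8601 UTC timestamps of the same format compare correctly as strings.
    """
    return issue["createdAt"] < "2015-01-01T00:00:00Z"


def fetch_page(token, owner, name, cursor=None):
    """POST one query to the GraphQL endpoint and return the decoded response."""
    request = urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=json.dumps(build_payload(owner, name, cursor)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A caller would loop on `fetch_page`, passing `pageInfo.endCursor` back in as `cursor` while `hasNextPage` is true, and could stop early once `is_pre_2015` returns false, since the issues arrive in creation order.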