DagsHub / open-source-ml-datasets

This repository holds open-source datasets for various machine learning domains, each with a link to download and use it.
https://dagshub.com/DagsHub/open-source-ml-datasets

Dataset Card for CodeSearchNet corpus #18

Closed Sookeyy-12 closed 1 year ago

Sookeyy-12 commented 1 year ago

The CodeSearchNet corpus is a dataset of 2 million (comment, code) pairs from open-source libraries hosted on GitHub. It contains code and documentation for several programming languages. The dataset can be found here: https://huggingface.co/datasets/code_search_net
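For anyone who wants to try it, here is a minimal sketch of pulling the (comment, code) pairs out of the records. It assumes the Hugging Face `datasets` library, the `python` language subset, and the field names `func_documentation_string` and `func_code_string` as I understand them from the dataset card; please verify them against the card before relying on this. The helper names `iter_pairs` and `load_codesearchnet` are hypothetical, and depending on the installed `datasets` version the loader may need extra arguments.

```python
from typing import Iterable, Iterator, Tuple


def iter_pairs(records: Iterable[dict]) -> Iterator[Tuple[str, str]]:
    """Yield (comment, code) pairs, skipping records that lack either part."""
    for rec in records:
        # Field names are assumed from the dataset card; adjust if they differ.
        comment = (rec.get("func_documentation_string") or "").strip()
        code = rec.get("func_code_string") or ""
        if comment and code:
            yield comment, code


def load_codesearchnet(language: str = "python", split: str = "train"):
    """Load one language subset from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helper above works without the library installed.
    from datasets import load_dataset  # pip install datasets

    return load_dataset("code_search_net", language, split=split)


# Tiny in-memory sample shaped like the assumed records, so the helper
# can be tried without downloading anything.
sample = [
    {
        "func_documentation_string": "Add two numbers.",
        "func_code_string": "def add(a, b):\n    return a + b",
    },
    {
        # No documentation, so this record is skipped.
        "func_documentation_string": "",
        "func_code_string": "def noop():\n    pass",
    },
]

pairs = list(iter_pairs(sample))
```

With network access, `iter_pairs(load_codesearchnet("python"))` should iterate over the real training split in the same way, assuming the field names above hold.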
