Welcome to DagsHub’s non-code contribution project for Hacktoberfest 2023!
In this Hacktoberfest challenge, DagsHub invites you to help enrich the pool of open-source datasets and make them more accessible and useful to the global machine-learning community.
DagsHub is a centralized platform for hosting and managing machine learning projects, including code, data, models, experiments, annotations, a model registry, and more! DagsHub does the MLOps heavy lifting for its users. Every repository comes with configured S3 storage, an experiment tracking server, and an annotation workspace - all built on popular open-source tools like MLflow, DVC, Git, and Label Studio.
Your mission is to import datasets from various sources, such as Kaggle, Hugging Face, or any other relevant platform, and integrate them into DagsHub. Hosting those datasets on DagsHub exposes them to our Data Engine, unlocking data management capabilities such as querying, visualizing, annotating, and streaming data for ML training. In addition, by adding crucial information and context to these datasets, you'll significantly boost their accessibility and usability.
To simplify this process, we've created a user-friendly Colab notebook that will do the import for you! Here's a quick overview of what you need to do:
Add a README.md file (e.g., Librispeech ASR corpus) to the repository on DagsHub with the following information:
Note: You can create a markdown file locally, upload it to DagsHub from the repository UI, and edit it from DagsHub - no need for coding whatsoever!
Add the dataset, hacktoberfest-2023, and hacktoberfest labels to the DagsHub repository.