Wondering about this approach from Hugging Face to managing data:
Each dataset is a Git repository, equipped with the necessary scripts to download the data and generate splits for training, evaluation, and testing. For information on how a dataset repository is structured, refer to the Structure your repository guide. Following the supported repo structure will ensure that your repository will have a preview on its dataset page on the Hub.
This would, in turn, require that
we have the ability to mirror external repos into DSH (this is possible with Gitea; with GitLab?)
we create synthetic/demo data externally in the chimera org (work with ATI)
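For the mirroring requirement, a minimal sketch of a one-shot mirror using plain git rather than Gitea's built-in mirror feature. The function names, URLs, and working directory are hypothetical placeholders, not anything that exists in DSH or the chimera org today.

```python
import subprocess
from pathlib import Path


def mirror_commands(source_url, target_url, workdir):
    """Build the git commands for a one-shot mirror of source into target.

    All three arguments are placeholders chosen for illustration.
    """
    local = str(Path(workdir) / "mirror.git")
    return [
        # Bare clone that copies every ref (branches, tags, notes).
        ["git", "clone", "--mirror", source_url, local],
        # Push every ref to the target, deleting refs that no longer exist upstream.
        ["git", "-C", local, "push", "--mirror", target_url],
    ]


def run_mirror(source_url, target_url, workdir):
    """Run the mirror commands in order, stopping at the first failure."""
    for cmd in mirror_commands(source_url, target_url, workdir):
        subprocess.run(cmd, check=True)
```

Calling `run_mirror("https://example.org/chimera/demo-data.git", "https://dsh.example/mirrors/demo-data.git", "/tmp/mirror-work")` would do the copy once; keeping the mirror up to date would need a scheduled re-run or the hosting platform's own mirror setting.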