Closed: sherpan closed this 2 days ago
Hey, I think pandas has a few toy datasets that might be relevant, or perhaps some HuggingFace datasets could be used for these use cases as well.
We do have some ideas on our future roadmap around letting users make their datasets public, which could solve this as well. For now I'll close this as won't fix.
Proposal summary
It would be nice to have some out-of-the-box toy datasets so users can play around and get started with the platform more quickly, something similar to load_iris in sklearn. To start, I was thinking of a 15-question sample dataset each for Machine Translation and ChatBots. I'm not sure whether users should still be expected to add the dataset to their workspace, or whether they should start out with the sample datasets already there and just call get_dataset.
I have the JSON file we could use for the MT sample dataset: french_phrases.json
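To make the idea a bit more concrete, here is a rough sketch of what a load_iris-style helper could look like. Everything in it is an assumption for illustration: `load_sample_dataset` is a hypothetical name, not an existing function in the platform, and the two demo rows (including the `input` / `expected_output` field names) are stand-ins rather than the actual contents of french_phrases.json.

```python
import json
import tempfile
from pathlib import Path


def load_sample_dataset(path):
    """Hypothetical loader: read a JSON array of records into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


# Illustrative stand-in rows, NOT the real french_phrases.json contents.
demo_rows = [
    {"input": "Bonjour", "expected_output": "Hello"},
    {"input": "Merci", "expected_output": "Thank you"},
]

# Write the stand-in rows to a temporary file, then load them back
# the way a bundled sample file might be loaded.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_phrases.json"
    demo_path.write_text(json.dumps(demo_rows, ensure_ascii=False), encoding="utf-8")
    dataset = load_sample_dataset(demo_path)

print(len(dataset), dataset[0]["input"])
```

If the sample files shipped inside the package, a helper like this could point at the bundled path by default, so a new user would not need to upload anything before running a first evaluation.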
Motivation
Improve the developer experience so that users can run evaluations without needing their own dataset.