-
Hi @cs-mshah,
Niels here from the open-source team at Hugging Face. I discovered your work as it was submitted on AK's daily papers: https://huggingface.co/papers/2409.14677. The paper page lets pe…
-
We recently discovered https://safetyprompts.com/, which has so many datasets!
We need help going through the website and creating a list of relevant datasets. A relevant dataset is one which conta…
-
```python
from datasets import load_dataset

ds = load_dataset(
    "speechcolab/gigaspeech",
    "xl",
    split="train",
    trust_remote_code=True,
    streaming=True,
)
```
As shown in the code…
-
## Detailed Description
It would be great to be able to load the ICON data from HF in our nwp open dataset
## Context
- https://huggingface.co/datasets/openclimatefix/dwd-icon-eu
- Discussion …
-
The section https://huggingface.co/docs/hub/datasets-adding#large-scale-datasets is somewhat small.
I think we could add content copied from https://huggingface.co/docs/hub/repositories-recommendat…
-
Is it updated monthly? Do you have an estimated release date for the September dataset?
-
Any plans for Hugging Face `datasets` integration?
Instead of a pickled dictionary, it is probably better practice to use the `arrow` or `parquet` format. It should be pretty easy to convert to Hugg…
-
Following up on https://github.com/iterative/dvc/issues/10313 and related new features specifying `datasets` as dependencies, we can add more types of supported datasets:
- [delta lake](https://itera…
-
### Describe the feature
- https://huggingface.co/datasets/openai/MMMLU
### Will you implement it?
- [ ] I would like to implement this feature and create a PR!
-
Right now it assumes that `splits.json` exists, but it doesn't show how to download it: https://huggingface.co/datasets/McGill-NLP/WebLINX-full/blob/main/splits.json
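
For reference, one possible way to fetch the file, sketched with only the standard library. It relies on the Hub serving raw files when `/blob/` in a file URL is replaced with `/resolve/`. The helper names (`blob_to_resolve_url`, `download_file`, `fetch_splits`) are hypothetical, and the sketch assumes the file can be downloaded without authentication. If the repository requires a login, `hf_hub_download` from `huggingface_hub` with a token would be the more usual route.

```python
import json
import urllib.request

# Web-page URL of the file, as linked in this issue.
BLOB_URL = (
    "https://huggingface.co/datasets/McGill-NLP/WebLINX-full/blob/main/splits.json"
)


def blob_to_resolve_url(blob_url: str) -> str:
    """Turn a Hub file-page ("blob") URL into a raw-download ("resolve") URL."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def download_file(url: str, dest: str) -> str:
    """Download `url` to the local path `dest` and return that path."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest


def fetch_splits(dest: str = "splits.json"):
    """Download splits.json (assumes anonymous access works) and parse it."""
    path = download_file(blob_to_resolve_url(BLOB_URL), dest)
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling `fetch_splits()` once before running the rest of the example would place `splits.json` in the working directory.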