-
### What happened + What you expected to happen
When using `override_num_blocks` with non-streaming Hugging Face datasets, the parameter is ignored. The current suggested workaround is to add a `.rep…
-
Movie and service reviews, so more "natural" language
(rated on a 1-5 scale) https://huggingface.co/datasets/Yelp/yelp_review_full
(binary positive/negative) https://huggingface.co/datasets/stanfordnlp/imdb
…
-
I have found two examples of the modality detection code failing to recognize modalities in text and image datasets that use the WebDataset format:
* https://huggingface.co/datasets/ProGamerGov/synthe…
-
Hi,
Niels here from the open-source team at Hugging Face. I discovered your work through the paper page: https://huggingface.co/papers/2408.15980.
Thanks for making the models and dataset availa…
-
https://huggingface.co/datasets/Hiraishin/RHB-PDF
Please verify the images.
Will extract the contents soon.
-
Hi,
Niels here from the open-source team at Hugging Face. I discovered your work through the paper page: https://huggingface.co/papers/2405.19707 (feel free to claim the paper so that it appears un…
-
**Is your feature request related to a problem? Please describe.**
Was working on https://github.com/ggerganov/llama.cpp/pull/8875 to integrate some changes to how we interpret parent models and da…
-
### Describe the bug
The cache location of a dataset varies depending on how you download it from Hugging Face:
1. Download using the CLI:
```bash
> huggingface-cli download 'wikimedia/wikiso…
-
Following up on https://github.com/iterative/dvc/issues/10313 and related new features specifying `datasets` as dependencies, we can add more types of supported datasets:
- [delta lake](https://itera…
-
Hi there,
Congrats on releasing the paper and dataset! Niels here from the open-source team at Hugging Face. I was wondering whether you would be up for making your dataset as a dedicated dataset r…