-
Now we have a CLI tool to convert a script dataset to a data-only one (parquet). See https://github.com/huggingface/datasets/releases/tag/2.19.0
We should reflect that in all the error messages f…
-
# Introduction
We downloaded the [Datacomp 1B set](https://huggingface.co/datasets/mlfoundations/datacomp_1b).
For verification, we only kept an image if its SHA256 checksum of the bytes matches wit…
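
A minimal sketch of the kind of check described above, assuming each image comes with an expected SHA256 hex digest. The names `sha256_hex`, `keep_image`, and `expected_sha256` are hypothetical and are not taken from the original pipeline:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of raw bytes as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def keep_image(image_bytes: bytes, expected_sha256: str) -> bool:
    """Keep an image only if the digest of its bytes equals the expected one.

    `expected_sha256` is an assumed parameter: a hex digest supplied
    alongside the image. Comparison ignores case and surrounding whitespace.
    """
    return sha256_hex(image_bytes) == expected_sha256.strip().lower()
```

Hashing the downloaded bytes directly, rather than a decoded or re-encoded image, means any change to the file, including a re-compression that looks identical, causes the check to fail.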
-
In the framework of TURTLE - a project under the UN Ocean Decade DITTO Programme - we'll use this issue to scope out and execute a "hello world" interoperability exercise between at least two (more we…
-
Currently, the URL scan is implemented for the first 100K rows [here](https://github.com/huggingface/datasets-server/blob/main/services/worker/src/worker/job_runners/split/opt_in_out_urls_scan_from_stre…
-
How to decide token at first?
-
Rerun looks great! I have many hours of 1+ kHz IMU data to analyze, so I am hoping Rerun can help with that. I know it is early days for time series visualization support for kHz data,…
-
### MLRun Version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of MLRun CE.
### Reproducible Example
```…
-
Currently, we only detect and report "audio", "image" and "text".
Ideally, we would have:
See https://github.com/huggingface/moon-landing/pull/9352#discussion_r1634909052 (internal)
--…
-
I'd like to propose that we evaluate the feasibility to support the faster [Arrow](https://arrow.apache.org/)-based data format.
-
### `brew doctor` output
```shell
Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
workin…