-
Several stops have massive positional deviations due to misplaced digits or typos:
```
"de:08327:7965:1:1","","Eutingen im Gäu Bahnhof",,"0.087825000000","0.484790000000",0,"de:08237:7965",,"","…
```
-
Could you explain how you concluded that the difference between the similarity score of AI-generated images with text and that of real images with text would increase when mixing AI-generated and real…
-
### Context
We have been building a compendium for human, and it has required an unreasonable amount of RAM. @kurtwheeler found that we were trying to build an initial matrix for filtering with more …
-
Dask can perform groupby operations relatively well for columns with low cardinality, but performance seems to degrade significantly for columns with more distinct values.
Let's look at the h2o gro…
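To make the cardinality effect concrete, here is a minimal sketch using pandas as a stand-in (Dask's groupby aggregates each partition and then combines the partial results, so the size of the per-partition result is what matters). The column names and sizes below are illustrative, not taken from the benchmark:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000
df = pd.DataFrame({
    "low_card": rng.integers(0, 10, n),       # 10 distinct keys
    "high_card": rng.integers(0, n // 2, n),  # ~50,000 distinct keys
    "value": rng.random(n),
})

# With few distinct keys the per-partition aggregate is tiny and cheap to
# combine across workers; with many distinct keys it approaches the size of
# the input, which is why a distributed groupby degrades into a large shuffle.
low = df.groupby("low_card")["value"].mean()
high = df.groupby("high_card")["value"].mean()
print(len(low), len(high))
```

In Dask the same asymmetry shows up as cheap tree-reductions for low-cardinality keys versus shuffle-heavy aggregations for high-cardinality ones.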
-
## Describe the bug
The code _works_ at the moment, but it does something slow, CPU-intensive, and memory-intensive: it passes massive `xr.Datasets` between processes.
The main process does…
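The cost can be illustrated with a plain numpy array as a stand-in for the `xr.Dataset`'s contents (the size is illustrative): passing it to a worker process means pickling the whole buffer, copying it through a pipe, and unpickling it on the other side.

```python
import pickle
import numpy as np

# Stand-in for one variable of a large xr.Dataset.
data = np.zeros((1_000, 1_000))  # ~8 MB of float64

# This is roughly what multiprocessing does under the hood when the object
# is passed as an argument to a worker: the full buffer gets serialized.
payload = pickle.dumps(data)
print(len(payload))  # at least the full ~8 MB in-memory size
```

Sharing memory (e.g. `multiprocessing.shared_memory`) or passing only file paths and slice descriptions avoids this per-process copy.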
-
**Is your feature request related to a problem? Please describe.**
Having a massive catalogue of ids/entities in the feature store is essential for the offline store and creating large historical dat…
-
Hello @Bostoncake 🤗
I'm Niels and I work on computer vision at Hugging Face. I see your paper is accepted as ECCV Poster, congratulations! I will index it here https://huggingface.co/papers/2403.09…
-
[Dataset](https://schema.org/Dataset) is pretty vague: it can cover anything from .zip files of .wav social science interviews to application-specific on-disk file formats, and so on. In theory we cou…
-
**Why is this needed**:
1. To improve query performance on (massive) geo data, bounding-box filters of datasources should be supported.
Many databases support efficient filters for geo data,…
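As a sketch of the predicate a datasource would need to push down, here is a plain-Python bounding-box filter; the point records and bbox values are hypothetical:

```python
# Bounding box as (min_lon, min_lat, max_lon, max_lat); values are made up.
bbox = (8.0, 48.0, 9.0, 49.0)

points = [
    {"id": 1, "lon": 8.5, "lat": 48.5},   # inside the box
    {"id": 2, "lon": 10.0, "lat": 48.5},  # east of the box
]

def in_bbox(point, box):
    """True if the point lies inside the axis-aligned bounding box."""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= point["lon"] <= max_lon and min_lat <= point["lat"] <= max_lat

matches = [p["id"] for p in points if in_bbox(p, bbox)]
print(matches)
```

A database with a spatial index (e.g. an R-tree) evaluates the same predicate without scanning every row, which is the performance win being requested here.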
-
Hi! I am using KenLM on massive corpora of text to explore the properties of those datasets (e.g., Common Crawl, Wikipedia, etc.).
I am not trying to use KenLM to generate new text; I want to expl…