kraina-ai / srai

Spatial Representations for Artificial Intelligence - a Python library toolkit for geospatial machine learning focused on creating embeddings for downstream tasks
https://kraina-ai.github.io/srai/
Apache License 2.0

Add datasets / benchmarks module #365

Open RaczeQ opened 10 months ago

RaczeQ commented 10 months ago

The library focuses on creating embeddings for downstream tasks, but we have a hard time benchmarking different embedders and models.

We would like to have example datasets that can be used for benchmarking the models. This module could connect to the Hugging Face API and download datasets from the Kraina directory.
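A minimal sketch of how such a module might resolve dataset files on the Hugging Face Hub. The registry, the repository ids (`kraina/airbnb`, `kraina/bss`), the file name `data.parquet`, and the helper names are all hypothetical placeholders, not existing repositories or srai API. Only the `https://huggingface.co/datasets/<repo_id>/resolve/<revision>/<filename>` URL pattern is taken from how the Hub serves dataset files.

```python
from dataclasses import dataclass

HF_BASE_URL = "https://huggingface.co"


@dataclass(frozen=True)
class DatasetSpec:
    """Location of one benchmark file on the Hugging Face Hub."""

    repo_id: str
    filename: str
    revision: str = "main"

    def url(self) -> str:
        # Hub download route for a file stored in a dataset repository.
        return (
            f"{HF_BASE_URL}/datasets/{self.repo_id}"
            f"/resolve/{self.revision}/{self.filename}"
        )


# Hypothetical registry: repo ids and file names are assumptions for illustration.
DATASETS = {
    "airbnb": DatasetSpec("kraina/airbnb", "data.parquet"),
    "bss": DatasetSpec("kraina/bss", "data.parquet"),
}


def get_dataset_url(name: str) -> str:
    """Return the download URL for a registered benchmark dataset."""
    try:
        return DATASETS[name].url()
    except KeyError:
        raise KeyError(
            f"Unknown dataset {name!r}; available: {sorted(DATASETS)}"
        ) from None
```

The returned URL could then be handed to whatever downloader or Parquet reader the module ends up using; that part is left out here because it depends on design decisions not made yet.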

Example datasets to be included:

Calychas commented 10 months ago
  1. Airbnb: X -> points, region IDs for different resolutions (H3?), Airbnb features; Y -> target (mean rent cost, ...)

  2. BSS: X -> region IDs for different resolutions; Y -> whether there is a bike station in the region
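
To make the X / Y split above concrete, here is a rough sketch of what a benchmark dataset container and a trivial baseline score could look like. `BenchmarkDataset`, `evaluate_mean_baseline`, the region id strings, and all numbers are made up for illustration; none of this is existing srai API or real Airbnb data.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass
class BenchmarkDataset:
    """One benchmark task: per-region features (X) and a target (Y)."""

    name: str
    region_ids: List[str]        # e.g. H3 cell ids; placeholders below
    features: List[List[float]]  # X: one feature vector per region
    targets: List[float]         # Y: e.g. mean rent cost per region

    def __post_init__(self) -> None:
        if not (len(self.region_ids) == len(self.features) == len(self.targets)):
            raise ValueError("region_ids, features and targets must align")


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def evaluate_mean_baseline(dataset: BenchmarkDataset, train_fraction: float = 0.5) -> float:
    """Predict the training-set mean for every test region and return the MAE."""
    split = int(len(dataset.targets) * train_fraction)
    if split == 0 or split == len(dataset.targets):
        raise ValueError("train_fraction leaves an empty train or test split")
    train_mean = mean(dataset.targets[:split])
    test_targets = dataset.targets[split:]
    return mean_absolute_error(test_targets, [train_mean] * len(test_targets))


# Toy data only: invented values, placeholder region ids.
toy_airbnb = BenchmarkDataset(
    name="airbnb-toy",
    region_ids=["region_a", "region_b", "region_c", "region_d"],
    features=[[1.0, 0.0], [2.0, 1.0], [3.0, 1.0], [4.0, 0.0]],
    targets=[100.0, 200.0, 300.0, 400.0],
)

baseline_mae = evaluate_mean_baseline(toy_airbnb)
```

A score like this would give every embedder a common floor to beat on the same dataset. A BSS-style task would need a classification metric instead of MAE, since its target is a yes/no label.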