pinecone-io / pinecone-datasets

An open-source dataset library for pre-embedded datasets: create your own data catalog, or use Pinecone's public datasets.
https://pinecone-io.github.io/pinecone-datasets/

Switch to ray datasets #13

Closed miararoy closed 1 year ago

miararoy commented 1 year ago

Problem

enhancement

Solution

Move away from Pandas and Polars as loaders in order to use Ray's parallel loading capability, and as an intermediate step towards dataset save.
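For illustration only, loading through Ray Data could look roughly like the sketch below. It is not the code from this PR: the helper names `split_path` and `load_split`, and the `<root>/<split>` directory layout, are assumptions made for the example. The only Ray call used is `ray.data.read_parquet`, which reads the Parquet files under a path in parallel.

```python
def split_path(dataset_root: str, split: str) -> str:
    """Join a dataset root and a split name into one directory path.

    The <root>/<split> layout is an assumption made for this sketch.
    """
    return f"{dataset_root.rstrip('/')}/{split}"


def load_split(dataset_root: str, split: str):
    """Read every Parquet file under <root>/<split> as one Ray Dataset."""
    # Imported lazily so the path helper stays usable without Ray installed.
    import ray

    # read_parquet accepts a file or directory path (local or cloud URI)
    # and reads the underlying files in parallel.
    return ray.data.read_parquet(split_path(dataset_root, split))
```

With Ray installed, `load_split("./my-dataset", "documents")` (a hypothetical local path) would return a Ray `Dataset`; calling `.to_pandas()` on it materializes the rows as a Pandas DataFrame for callers that still expect one.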

Type of Change

Test Plan

All unit (= integration) tests were updated and are passing.