weaviate / weaviate-python-client

A python native client for easy interaction with a Weaviate instance.
https://weaviate.io/developers/weaviate/current/client-libraries/python.html
BSD 3-Clause "New" or "Revised" License

[DX] Save a Collection's objects into a local JSON file #1172

Open CShorten opened 2 months ago

CShorten commented 2 months ago

What?

Get the objects from a Collection in Weaviate and save them to a local JSON file.

Why?

This helps with quick experimentation, and possibly other use cases.

How?

I'm thinking of an API like this:

collection = client.collections.get("WineReview")

collection.to_json(savepath="wine_reviews.json", num_samples=100) # returns bool success/fail

That pretty much just wraps the cursor API:

collection = client.collections.get("WineReview")

for item in collection.iterator():
    print(item.uuid, item.properties)

And then dumps the objects to a JSON file at savepath.

It might also make the json Python library another dependency of Weaviate's Python Client (if it isn't one already).
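To make the proposal concrete, here is a minimal sketch of what such a wrapper could look like, written as a standalone function rather than a method. The name collection_to_json, the output layout (a list of {"uuid", "properties"} records), and the error handling are all assumptions for illustration, not part of the client. It relies only on the collection.iterator() call shown above and on items exposing uuid and properties. Note that json ships with Python's standard library, so this sketch adds no third-party dependency.

```python
import json
from itertools import islice
from typing import Any, Optional


def collection_to_json(
    collection: Any, savepath: str, num_samples: Optional[int] = None
) -> bool:
    """Hypothetical helper: write up to num_samples objects to a JSON file.

    Returns True on success and False if writing or serializing fails.
    """
    try:
        # islice stops after num_samples items; None means "take everything".
        items = islice(collection.iterator(), num_samples)
        records = [
            {"uuid": str(item.uuid), "properties": item.properties}
            for item in items
        ]
        with open(savepath, "w", encoding="utf-8") as f:
            # default=str is an assumption: it stringifies values that json
            # cannot serialize natively (for example datetimes or UUIDs).
            json.dump(records, f, ensure_ascii=False, indent=2, default=str)
        return True
    except (OSError, TypeError, ValueError):
        # Only file and serialization errors are mapped to False here;
        # errors raised by the client itself would still propagate.
        return False
```

Usage would then mirror the proposed call, e.g. collection_to_json(collection, "wine_reviews.json", num_samples=100). Everything is collected in memory before writing, which is fine for small samples but probably not for very large collections.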

CShorten commented 1 month ago

Hey team, this notebook contains some more investigation into this idea of moving data between Weaviate Collections and the HuggingFace Dataset Hub or JSON files: https://github.com/weaviate/recipes/blob/main/weaviate-features/crud-apis/research/weaviate-collection-interoperability.ipynb