Open DataNoob0723 opened 1 month ago
To save and load a PropertyGraphIndex using AWS S3, you can follow these steps:
Set up S3FS: Ensure you have the s3fs package installed and configure it with your AWS credentials.
Save the Index:
import os

import s3fs

# Set up s3fs
AWS_KEY = os.environ["AWS_ACCESS_KEY_ID"]
AWS_SECRET = os.environ["AWS_SECRET_ACCESS_KEY"]
R2_ACCOUNT_ID = os.environ["R2_ACCOUNT_ID"]
s3 = s3fs.S3FileSystem(
    key=AWS_KEY,
    secret=AWS_SECRET,
    endpoint_url=f"https://{R2_ACCOUNT_ID}.r2.cloudflarestorage.com",
    # Note: the "public-read" ACL makes persisted objects publicly readable
    s3_additional_kwargs={"ACL": "public-read"},
)

# `index` is an existing PropertyGraphIndex built earlier
index.set_index_id("property_graph_index")
index.storage_context.persist(persist_dir="llama-index/storage_demo", fs=s3)
Load the Index:
from llama_index.core import StorageContext, load_index_from_storage

index_from_s3 = load_index_from_storage(
    StorageContext.from_defaults(persist_dir="llama-index/storage_demo", fs=s3),
    index_id="property_graph_index",
)
Ensure you have the necessary S3 permissions (s3:ListBucket, s3:PutObject, s3:GetObject) to perform these operations [1].
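On AWS, the three permissions above could be granted with an IAM policy along these lines. This is a minimal sketch, not something from the thread: the helper name build_index_policy is hypothetical, and it assumes the first segment of persist_dir is the bucket name (which is how s3fs reads paths), so "llama-index/storage_demo" means bucket "llama-index" and prefix "storage_demo".

```python
import json


def build_index_policy(persist_dir: str) -> dict:
    """Build a minimal IAM policy for persisting/loading an index under persist_dir.

    Hypothetical helper: assumes persist_dir has the form "<bucket>/<prefix>".
    """
    bucket, _, prefix = persist_dir.strip("/").partition("/")
    # Object-level actions apply to object ARNs; ListBucket applies to the bucket ARN.
    if prefix:
        object_arn = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        object_arn = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": object_arn,
            },
        ],
    }


policy = build_index_policy("llama-index/storage_demo")
print(json.dumps(policy, indent=2))
```

Scoping the object-level statement to the persist prefix keeps the credentials from reading or writing other keys in the same bucket.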
To continue talking to Dosu, mention @dosu.
Why do I need R2_ACCOUNT_ID here?
The R2_ACCOUNT_ID is needed to construct the endpoint_url for the S3-compatible storage service (in this example Cloudflare R2, as the r2.cloudflarestorage.com hostname indicates). This URL is essential for s3fs.S3FileSystem to connect to the correct storage endpoint, as shown in the setup example:
s3 = s3fs.S3FileSystem(
    key=AWS_KEY,
    secret=AWS_SECRET,
    endpoint_url=f"https://{R2_ACCOUNT_ID}.r2.cloudflarestorage.com",
    s3_additional_kwargs={"ACL": "public-read"},
)
This ensures that the connection is made to the right storage service [1].
Can I do it without R2_ACCOUNT_ID?
Yes, you can configure s3fs.S3FileSystem without the R2_ACCOUNT_ID if you're using standard AWS S3. You would simply omit the endpoint_url or set it to None:
s3 = s3fs.S3FileSystem(
    key=AWS_KEY,
    secret=AWS_SECRET,
    endpoint_url=None,  # Use None for standard AWS S3
)
This setup will connect directly to AWS S3 without needing a custom endpoint URL [1].
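If the same code needs to work against both standard AWS S3 and an S3-compatible service such as R2, the endpoint choice can be driven by whether R2_ACCOUNT_ID is set. The following is only a sketch under that assumption: s3fs_kwargs is a hypothetical helper name, and it returns plain keyword arguments so it can be checked without s3fs installed or real credentials.

```python
import os
from typing import Mapping, Optional


def s3fs_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for s3fs.S3FileSystem from environment variables.

    Hypothetical helper: sets a Cloudflare R2 endpoint only when R2_ACCOUNT_ID
    is present; otherwise leaves endpoint_url as None for standard AWS S3.
    """
    env = os.environ if env is None else env
    account_id = env.get("R2_ACCOUNT_ID")
    endpoint_url = (
        f"https://{account_id}.r2.cloudflarestorage.com" if account_id else None
    )
    return {
        "key": env["AWS_ACCESS_KEY_ID"],
        "secret": env["AWS_SECRET_ACCESS_KEY"],
        "endpoint_url": endpoint_url,
    }


# Example with placeholder values instead of real credentials:
aws_kwargs = s3fs_kwargs({"AWS_ACCESS_KEY_ID": "k", "AWS_SECRET_ACCESS_KEY": "s"})
r2_kwargs = s3fs_kwargs(
    {"AWS_ACCESS_KEY_ID": "k", "AWS_SECRET_ACCESS_KEY": "s", "R2_ACCOUNT_ID": "abc123"}
)
print(aws_kwargs["endpoint_url"])
print(r2_kwargs["endpoint_url"])

# With s3fs installed: s3 = s3fs.S3FileSystem(**s3fs_kwargs())
```

The resulting filesystem object can then be passed as fs=s3 to persist and StorageContext.from_defaults exactly as in the earlier steps.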
Question Validation
Question
I am using PropertyGraphIndex. How can I save and load a PropertyGraphIndex from AWS S3?