Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Good day, everyone. I'm trying to use LanceDB on AWS from a Glue job (which has read and write permissions to S3), but I'm getting an error with the following code:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import lancedb
import pandas as pd
import pyarrow as pa
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
uri = "s3://bronze-layer-aura/temp/"
db = lancedb.connect(uri)
data = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
]
tbl = db.create_table("my_table", data=data)
job.commit()
What am I doing wrong?