lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0

Failed to get AWS credentials: an error occurred while loading credentials #2466

Open JosueCL11 opened 5 months ago

JosueCL11 commented 5 months ago

Good day everyone. I'm trying to use lancedb on AWS from a Glue job (which has read and write permissions to S3), but I'm getting this error with the following code:

[Screenshot, 2024-06-13 at 12:13:47 p.m.]

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import lancedb
import pandas as pd
import pyarrow as pa

args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

uri = "s3://bronze-layer-aura/temp/"
db = lancedb.connect(uri)

data = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
]

tbl = db.create_table("my_table", data=data)

job.commit()

What am I doing wrong?

chebbyChefNEQ commented 5 months ago

Could you check whether aws sts get-caller-identity returns the correct IAM identity on the node you are running on?
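
If a shell is not available on the Glue worker, the same identity check can be done from inside the job script. This is a minimal sketch, assuming boto3 is importable in the job (AWS Glue Python environments normally include it); `describe_caller` is a hypothetical helper name used for illustration and is not part of lancedb or lance.

```python
def describe_caller(sts_client=None):
    """Return the AWS account and ARN that the current credentials resolve to.

    sts_client is optional so that a stub can be passed in for testing;
    by default a real STS client is created with boto3.
    """
    if sts_client is None:
        # Imported lazily so the helper can be loaded without boto3 installed.
        import boto3
        sts_client = boto3.client("sts")

    # GetCallerIdentity reports which IAM principal the credentials belong to.
    identity = sts_client.get_caller_identity()
    return {"account": identity["Account"], "arn": identity["Arn"]}


# In the Glue script, before calling lancedb.connect(uri):
#     print(describe_caller())
```

If the call itself fails, or the printed ARN is not the role expected for the Glue job, the problem is likely in how credentials are resolved in the job environment rather than in the table-creation code.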