Proxy for AWS
Build the project:

./mvnw -DskipTests clean install
Start a testing Trino AWS proxy, a Postgres database container, a MinIO object store container, and a Hive metastore container:
./mvnw exec:java -Dexec.mainClass=io.trino.aws.proxy.server.LocalServer -Dexec.classpathScope=test
Make note of the last few lines output to the console and copy the endpoint URL, access key, secret key, and metastore port; they are needed in the steps below.
In a separate terminal...
> aws configure
-- enter the access key and secret key from the console output
-- enter "us-east-1" for the region
-- then try AWS CLI commands against the proxy
> aws --endpoint-url <endpoint from above> s3 ls
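The proxy exposes the standard S3 API, so SDK clients can be pointed at it as well. Here is a minimal sketch using boto3 (assuming boto3 is installed locally); the placeholders follow the same convention used below and must be replaced with the endpoint and keys printed by LocalServer:

import boto3

# Placeholders: use the endpoint, access key, and secret key printed to the console.
s3 = boto3.client(
    "s3",
    endpoint_url="ENDPOINT-GOES-HERE",
    aws_access_key_id="ACCESS-KEY-GOES-HERE",
    aws_secret_access_key="SECRET-KEY-GOES-HERE",
    region_name="us-east-1",
)

# List buckets through the Trino AWS proxy
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])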
In a separate terminal...
-- start a PySpark container and then enter the commands shown below
> docker run -it spark:3.5.1-scala2.12-java17-python3-ubuntu /opt/spark/bin/pyspark
# lots of text will be printed here
# The Spark default context needs to be stopped and re-created to
# point at the Trino AWS proxy:
spark.stop()
spark = SparkSession\
    .builder\
    .appName("test")\
    .config("hive.metastore.uris", "thrift://host.docker.internal:METASTORE-SERVER-PORT-GOES-HERE")\
    .enableHiveSupport()\
    .config("spark.hadoop.fs.s3a.endpoint", "ENDPOINT-GOES-HERE - REPLACE 127.0.0.1 with host.docker.internal")\
    .config("spark.hadoop.fs.s3a.access.key", "ACCESS-KEY-GOES-HERE")\
    .config("spark.hadoop.fs.s3a.secret.key", "SECRET-KEY-GOES-HERE")\
    .config("spark.hadoop.fs.s3a.path.style.access", True)\
    .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")\
    .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")\
    .config("spark.hadoop.fs.s3a.connection.ssl.enabled", False)\
    .getOrCreate()
# try Spark SQL commands
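# For example (the database, table, and bucket names below are made up;
# the s3a bucket must already exist in the MinIO object store):
spark.sql("SHOW DATABASES").show()
spark.sql("CREATE DATABASE IF NOT EXISTS demo LOCATION 's3a://test-bucket/demo'")
spark.sql("CREATE TABLE IF NOT EXISTS demo.events (id INT, name STRING) USING parquet")
spark.sql("INSERT INTO demo.events VALUES (1, 'one'), (2, 'two')")
spark.sql("SELECT * FROM demo.events").show()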
HTTP credentials provider headers

The HTTP credentials provider provides an option to include additional headers on requests sent to the HTTP service (e.g., for authentication). These can be configured with credentials-provider.http.headers. This config entry is formatted as a comma-separated list of header names and values, where each entry is in the format header-name:header-value, for instance header1:value1,header2:value2.

If a header name or value needs to contain a comma, it can be escaped by doubling it: ,, translates to a single comma in the literal header name or value and is not treated as a separator. For example, setting this config property to "x-api-key: xyz,,123, Authorization: key,,,,123" results in two headers:

- x-api-key with value xyz,123
- Authorization with value key,,123
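To make the escaping rule concrete, here is a small illustrative sketch of the splitting logic in Python. This is not the proxy's actual parser; parse_headers is a made-up helper, and it assumes whitespace around names and values is trimmed:

def parse_headers(value):
    # Split on single commas; a doubled comma ",," stands for a literal comma.
    entries = []
    current = []
    i = 0
    while i < len(value):
        if value[i] == ",":
            if i + 1 < len(value) and value[i + 1] == ",":
                current.append(",")               # escaped comma -> keep one literal comma
                i += 2
                continue
            entries.append("".join(current))      # unescaped comma -> entry separator
            current = []
            i += 1
            continue
        current.append(value[i])
        i += 1
    entries.append("".join(current))

    headers = {}
    for entry in entries:
        name, _, val = entry.partition(":")       # first colon separates name and value
        headers[name.strip()] = val.strip()
    return headers

print(parse_headers("x-api-key: xyz,,123, Authorization: key,,,,123"))
# {'x-api-key': 'xyz,123', 'Authorization': 'key,,123'}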