[Open] leaves12138 opened this issue 1 year ago
Please assign this to me, thanks. I'd like to try it.
Any news?
A complete, clear example would be great. I tried the one below and it doesn't work; does anyone know why?
from pyspark.sql import SparkSession

# https://paimon.apache.org/docs/master/spark/quick-start/#preparation
spark = (SparkSession.builder
    .appName("Paimon")
    .config("spark.jars.packages",
            "org.apache.paimon:paimon-spark-3.3:0.7.0-incubating,"
            "org.apache.paimon:paimon-s3:0.7.0-incubating")
    # S3 / MinIO: s3a settings for Spark/Hadoop itself
    .config("spark.hadoop.fs.s3a.access.key", "XXXXXXXXX")
    .config("spark.hadoop.fs.s3a.secret.key", "XXXXXXXXX")
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio.minio:9000")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .config("spark.hadoop.fs.s3a.fast.upload", "true")
    .config("spark.hadoop.fs.s3a.multipart.size", "104857600")
    # was "fs.s3a.connection.maximum": without the spark.hadoop. prefix it never reaches Hadoop
    .config("spark.hadoop.fs.s3a.connection.maximum", "100")
    .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
    # Paimon catalog: Paimon reads its own s3.* keys, not the s3a ones above
    .config("spark.sql.catalog.paimon", "org.apache.paimon.spark.SparkCatalog")
    .config("spark.sql.catalog.paimon.s3.access-key", "XXXXXXXX")
    .config("spark.sql.catalog.paimon.s3.secret-key", "XXXXXXXXX")
    .config("spark.sql.catalog.paimon.s3.endpoint", "http://minio.minio:9000")
    .config("spark.sql.catalog.paimon.s3.path.style.access", "true")  # usually needed for MinIO
    .config("spark.sql.catalog.paimon.warehouse", "s3://lakehouse/paimon")
    .config("spark.sql.extensions", "org.apache.paimon.spark.extensions.PaimonSparkSessionExtensions")
    .getOrCreate()
)
Spark version: 3.3.0 (matching the paimon-spark-3.3 package above).
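If the session above builds, a quick smoke test against the catalog helps narrow the failure down to either the session config or the S3 connectivity. This is a sketch following the Paimon Spark quick-start; the `paimon.demo` database and table names are placeholders I made up, and it needs a live MinIO endpoint, so there is no standalone output to assert:

```python
# Smoke test: exercise the Paimon catalog end to end.
# Assumes the SparkSession from the snippet above and a reachable MinIO bucket.
spark.sql("CREATE DATABASE IF NOT EXISTS paimon.demo")
spark.sql("""
    CREATE TABLE IF NOT EXISTS paimon.demo.t (
        id INT,
        name STRING
    )
""")
spark.sql("INSERT INTO paimon.demo.t VALUES (1, 'a')")
# If this show() succeeds, both the catalog wiring and the s3 credentials work.
spark.sql("SELECT * FROM paimon.demo.t").show()
```

If the CREATE statements fail but the session itself starts, the problem is usually on the Paimon `s3.*` side (endpoint, path-style access) rather than the s3a settings.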
Search before asking
Motivation
The docs about S3 storage are fuzzy; we need an example explaining how to connect S3 storage. (We can use MinIO.)
Solution
No response
Anything else?
No response
Are you willing to submit a PR?