I'm trying to access ceph storage, located locally on OpenShift cluster, but I'm using:
spark.hadoop.fs.s3a.path.style.access=true
but when job is run I get:
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:com.amazonaws.AmazonClientException: Unable to execute HTTP request: test.ceph-nano-0: Name or service not known);
test - is a bucket name, I try to access it with "LOCATION 's3a://test/import'"
Steps to reproduce:
Create spark cluster with thrift server and CEPH as a storage.
2.Try to execute SparkSQL query and create external table, stored with s3 ceph storage.
Description:
I'm trying to access ceph storage, located locally on OpenShift cluster, but I'm using:
spark.hadoop.fs.s3a.path.style.access=true
but when job is run I get:
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:com.amazonaws.AmazonClientException: Unable to execute HTTP request: test.ceph-nano-0: Name or service not known);
test - is a bucket name, I try to access it with "LOCATION 's3a://test/import'"
Steps to reproduce:
Spark cluster details:
spec: customImage: 'quay.io/opendatahub/spark-cluster-image:2.4.3-h2.7' env:
Thrift server details:
apiVersion: v1 kind: Secret metadata: name: thriftserver-server-conf stringData: thrift-server.conf: |- spark.blockManager.port=42100 spark.cores.max=2 spark.driver.bindAddress=0.0.0.0 spark.driver.host=thriftserver.$(namespace).svc spark.driver.memory=2G spark.driver.port=42000 spark.executor.memory=2G spark.hadoop.datanucleus.rdbms.datastoreAdapterClassName=org.datanucleus.store.rdbms.adapter.PostgreSQLAdapter spark.hadoop.datanucleus.schema.autoCreateAll=true spark.hadoop.fs.s3a.endpoint=$(s3_endpoint_url) spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.EnvironmentVariableCredentialsProvider spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem spark.hadoop.fs.s3a.path.style.access=true spark.hadoop.javax.jdo.option.ConnectionDriverName=org.postgresql.Driver spark.hadoop.javax.jdo.option.ConnectionPassword=$(database_password) spark.hadoop.javax.jdo.option.ConnectionURL=jdbc:postgresql://thriftserver-db.$(namespace).svc:5432/$(database_name) spark.hadoop.javax.jdo.option.ConnectionUserName=$(database_user) spark.sql.adaptive.enabled=true spark.sql.thriftServer.incrementalCollect=true spark.sql.warehouse.dir=/spark-warehouse
Spark Operator 1.1.0 used with Open Data Hub