Open cic1988 opened 9 months ago
Hello experts,
I followed the protocol example to build the reference server. The server generates a presigned URL when the `table/query` endpoint is called.
Assume that my `table_url` is `profile.json#share.schema.table`.
Calling `df = delta_sharing.load_as_pandas(table_url, limit=3)` loads the data fine, but `load_as_spark` fails.
The code is as follows:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Delta Share Demo") \
    .config('spark.jars', 'packages/haddop-azure-3.3.6.jar,packages/delta-sharing-spark_2.12-0.6.4.jar') \
    .getOrCreate()

...

import delta_sharing

df = delta_sharing.load_as_spark(table_url)
df.limit(2).select("path").show()
```
The error shows:
java.lang.RuntimeException: delta-sharing:/profile.json%23share.schema.table/123/25169076 is not a Parquet file. Expected magic number at tail, but found [0, 20, 14, 55]
Have you seen this error before?
@cic1988 Sorry, I haven't seen it before. Is this still happening? Do you have a full stack trace?
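In case it helps narrow this down: a Parquet file starts and ends with the 4-byte magic `PAR1`, and the error above says the reader found `[0, 20, 14, 55]` at the tail instead. A minimal sketch for checking this by hand, assuming you have saved the bytes returned by one presigned URL to a local file. The helper names `read_head_and_tail` and `looks_like_parquet` are hypothetical and not part of `delta_sharing`.

```python
import os

# Parquet files begin and end with this 4-byte marker.
PARQUET_MAGIC = b"PAR1"


def read_head_and_tail(path):
    """Return the first and last 4 bytes of a local file (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head, tail


def looks_like_parquet(path):
    """Return True if the file carries the Parquet magic at both ends."""
    # Anything shorter than the two magic markers cannot be a Parquet file.
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    head, tail = read_head_and_tail(path)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

If `looks_like_parquet` returns `False` for a file downloaded from the presigned URL, printing `list(read_head_and_tail(path)[1])` shows the tail bytes as integers, which can be compared directly with the values in the exception message.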