[BUG] Cannot read files into dataframe in Databricks 13.3 LTS Runtime 3.3.0 Spark

dinesh1512 commented 7 months ago

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

When running the v2 Excel PySpark code below in Databricks 13.3 LTS Runtime:

    df = (
        spark.read.format("excel")
        .option("header", True)
        .option("inferSchema", True)
        .load(fr"{folderpath}//.xlsx")
    )
    display(df)

I receive the following error upon attempting to display or use the resulting dataframe:

AbstractMethodError: org.apache.spark.sql.execution.datasources.v2.FilePartitionReaderFactory.options()Lorg/apache/spark/sql/catalyst/FileSourceOptions;

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.178.42.202 executor 0): java.lang.AbstractMethodError: org.apache.spark.sql.execution.datasources.v2.FilePartitionReaderFactory.options()Lorg/apache/spark/sql/catalyst/FileSourceOptions;

This issue is the same as https://github.com/crealytics/spark-excel/issues/682, which was addressed for older versions.

Expected Behavior

The resulting dataframe should display the data.

Steps To Reproduce

Set the folderpath variable to a location containing Excel files, and run the Python code below in the latest Databricks runtime:

    df = (
        spark.read.format("excel")
        .option("header", True)
        .option("inferSchema", True)
        .load(fr"{folderpath}//.xlsx")
    )
    display(df)

Environment

- Spark version: 3.4.1
- Spark-Excel version: 0.18.7
- OS: N/A
- Cluster environment
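
An AbstractMethodError like the one above usually points to a library compiled against a different Spark version than the one on the cluster. A minimal sketch for checking this, assuming the library coordinate follows the `group:artifact_<scala>:<sparkVersion>_<excelVersion>` layout; the helper names and the example coordinate are hypothetical and not taken from this report:

```python
import re

# Assumed coordinate layout (not confirmed by this report):
#   group:artifact_<scalaVersion>:<sparkVersion>_<excelVersion>
COORD_RE = re.compile(
    r"^(?P<group>[^:]+):(?P<artifact>[^:]+)_(?P<scala>\d+\.\d+):"
    r"(?P<spark>\d+\.\d+\.\d+)_(?P<excel>[\w.\-]+)$"
)


def parse_coordinate(coordinate: str) -> dict:
    """Split a Maven-style coordinate into its named parts."""
    match = COORD_RE.match(coordinate)
    if match is None:
        raise ValueError(f"Unrecognized coordinate layout: {coordinate!r}")
    return match.groupdict()


def built_for_running_spark(coordinate: str, running_spark: str) -> bool:
    """Heuristic: compare major.minor of the build target and the running Spark."""
    built = parse_coordinate(coordinate)["spark"]
    return built.split(".")[:2] == running_spark.split(".")[:2]


# In a notebook, running_spark would come from spark.version.
# The coordinate below is illustrative only.
print(built_for_running_spark("com.crealytics:spark-excel_2.12:3.3.1_0.18.7", "3.4.1"))
```

If the check returns False, installing a build that targets the cluster's Spark version is the first thing to try before digging further.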

Anything else?

No response

github-actions[bot] commented 7 months ago

Please check these potential duplicates:

[#712] [BUG] Cannot read files into dataframe in Databricks 9.1 LTS Runtime 3.1.2 Spark (70.06%)

If this issue is a duplicate, please add any additional info to the ticket with the most information and close this one.

github-actions[bot] commented 7 months ago

Please check these potential duplicates:

[#712] [BUG] Cannot read files into dataframe in Databricks 9.1 LTS Runtime 3.1.2 Spark (72.88%)

If this issue is a duplicate, please add any additional info to the ticket with the most information and close this one.

nightscape commented 7 months ago

Please always try the newest version before creating issues. Closing this until the issue is reproduced with the newest version.
