MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Sparklyr code is not working for Apache Spark version 3.3 #115795

Closed bernobfer closed 1 year ago

bernobfer commented 1 year ago

When using a Spark Pool with version 3.3, the following piece of code provided by the documentation:

%%sparkr
spark_version = sparkR.version()
config = spark_config()
sc = spark_connect(master = 'yarn', version = spark_version, spark_home = '/opt/spark', config = config)

gives the following error:

Error: [1] 'Error: org.apache.spark.SparkException: The Spark SQL phase analysis failed with an internal error. Please, fill a bug report in, and provide the full stack trace.....

Any operation that uses the sc object fails with the above error. The only exception is spark_connection_is_open(sc), which returns TRUE.

By adding the argument method='synapse', like this:

%%sparkr
spark_version = sparkR.version()
config = spark_config()
sc = spark_connect(master = 'yarn', version = spark_version, spark_home = '/opt/spark', config = config, method='synapse')

With this change, the error no longer appears. However, this is not reflected in the documentation.
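
A quick way to confirm that the connection is usable, not just open, is to run one small operation through sc. The following is a minimal sketch, not taken from the documentation. It assumes a %%sparkr cell in a Synapse notebook on a Spark 3.3 pool, where sparkR.version() is already available. The table name mtcars_check is a hypothetical name chosen for illustration.

```r
library(sparklyr)

# Connect with the workaround from this issue: method = 'synapse'.
spark_version <- sparkR.version()
config <- spark_config()
sc <- spark_connect(master = 'yarn', version = spark_version, spark_home = '/opt/spark',
                    config = config, method = 'synapse')

# Copy a small built-in data frame to Spark and count its rows.
# "mtcars_check" is a hypothetical table name used only for this check.
cars_tbl <- copy_to(sc, mtcars, "mtcars_check", overwrite = TRUE)

# mtcars has 32 rows locally, so the Spark-side count should match
# if operations on sc work.
sdf_nrow(cars_tbl)
```

If the connection was created without method = 'synapse', this kind of operation is where the error above would be expected to appear, since spark_connection_is_open(sc) alone does not exercise Spark SQL.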


AjayBathini-MSFT commented 1 year ago

@bernobfer Thanks for your feedback! We will investigate and update as appropriate.

RamanathanChinnappan-MSFT commented 1 year ago

@bernobfer Thanks for reporting this! We have created a PR for this issue and the changes should go live soon.