Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.

[BUG] ClassNotFoundException: Failed to find data source: cosmos.oltp #33137

Open jainr opened 1 year ago

jainr commented 1 year ago

Describe the bug ClassNotFoundException: Failed to find data source: cosmos.oltp.

Exception or Stack Trace

2023-01-21 01:16:07.270 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:290 - Feathr job has failed. Please visit this page to view error message: ***?o=5638037984879289#job/229669116997310/run/15335820
2023-01-21 01:16:07.270 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:293 - Error Code: ClassNotFoundException: Failed to find data source: cosmos.oltp. Please find packages at http://spark.apache.org/third-party-projects.html
Caused by: ClassNotFoundException: cosmos.oltp.DefaultSource
2023-01-21 01:16:07.271 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:295 -
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:765)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:819)
    at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:1106)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:341)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:312)
    at com.linkedin.feathr.offline.config.location.GenericLocationAdHocPatches$.writeDf(GenericLocation.scala:163)
    at com.linkedin.feathr.offline.config.location.GenericLocation.writeDf(GenericLocation.scala:58)
    at

Source - https://github.com/feathr-ai/feathr/actions/runs/3971935845/jobs/6810424495

To Reproduce Use the azure.cosmos.spark package on Databricks as part of the Maven JAR, without installing it directly on the cluster.
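To narrow this down, it may help to check what the job's classloader can actually see. The sketch below is only a diagnostic, not a fix, and rests on assumptions: `DataSourceClasspathCheck` and its methods are hypothetical names, and it assumes the connector registers its short name through the standard `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister` service file. Spark looks up short names such as `cosmos.oltp` through that file before falling back to `cosmos.oltp.DefaultSource`, which matches the "Caused by" line in the trace. If the connector classes load but no Cosmos provider is listed, one possible (unconfirmed) cause is that the service files were overwritten rather than merged when the JAR was built.

```scala
import scala.collection.JavaConverters._
import scala.io.Source
import scala.util.Try

// Hypothetical diagnostic helper; not part of Feathr or the Cosmos connector.
object DataSourceClasspathCheck {
  // Service file that Spark's ServiceLoader-based lookup reads to resolve short names.
  val RegistryResource =
    "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"

  private def loader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  // True if the class can be found by the current classloader (without initializing it).
  def isLoadable(className: String): Boolean =
    Try(Class.forName(className, false, loader)).isSuccess

  // Provider class names listed in every copy of the service file on the classpath.
  def registeredProviders(): Seq[String] =
    loader.getResources(RegistryResource).asScala.toList.flatMap { url =>
      val src = Source.fromInputStream(url.openStream(), "UTF-8")
      try {
        src.getLines()
          .map(_.trim)
          .filter(line => line.nonEmpty && !line.startsWith("#"))
          .toList
      } finally src.close()
    }
}
```

Running `DataSourceClasspathCheck.isLoadable("com.azure.cosmos.spark.CosmosCatalog")` and printing `DataSourceClasspathCheck.registeredProviders()` from the same job (or a notebook attached to the same cluster) should show whether the connector classes are present and whether any `com.azure.cosmos.spark` provider is registered.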


Expected behavior It should work as expected and connect to CosmosDB.

location.format.toLowerCase() match {
  case "cosmos.oltp" =>
    // Ensure the database and the table exist before writing
    val endpoint = location.options.getOrElse("spark.cosmos.accountEndpoint", throw new FeathrException("Missing sparkcosmosaccountEndpoint"))
    val key = location.options.getOrElse("spark.cosmos.accountKey", throw new FeathrException("Missing sparkcosmosaccountKey"))
    val databaseName = location.options.getOrElse("spark.cosmos.database", throw new FeathrException("Missing sparkcosmosdatabase"))
    val tableName = location.options.getOrElse("spark.cosmos.container", throw new FeathrException("Missing sparkcosmoscontainer"))
    ss.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
    ss.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", endpoint)
    ss.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", key)
    ss.sql(s"CREATE DATABASE IF NOT EXISTS cosmosCatalog.${databaseName};")
    ss.sql(s"CREATE TABLE IF NOT EXISTS cosmosCatalog.${databaseName}.${tableName} using cosmos.oltp TBLPROPERTIES(partitionKeyPath = '/id')")

ghost commented 1 year ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar, @TheovanKraay

jainr commented 1 year ago

Related to this issue - https://github.com/Azure/azure-sdk-for-java/issues/32073

vishwas-neo commented 7 months ago

I am facing the same issue. Are there any updates on this?