jainr opened this issue 1 year ago (status: Open)
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar, @TheovanKraay
Related to this issue - https://github.com/Azure/azure-sdk-for-java/issues/32073
I am facing the same issue. Any updates on this?
Describe the bug
ClassNotFoundException: Failed to find data source: cosmos.oltp.
Exception or Stack Trace

```
2023-01-21 01:16:07.270 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:290 - Feathr job has failed. Please visit this page to view error message: ***?o=5638037984879289#job/229669116997310/run/15335820
2023-01-21 01:16:07.270 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:293 - Error Code: ClassNotFoundException: Failed to find data source: cosmos.oltp. Please find packages at http://spark.apache.org/third-party-projects.html
Caused by: ClassNotFoundException: cosmos.oltp.DefaultSource
2023-01-21 01:16:07.271 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:295 -
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:765)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:819)
    at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:1106)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:341)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:312)
    at com.linkedin.feathr.offline.config.location.GenericLocationAdHocPatches$.writeDf(GenericLocation.scala:163)
    at com.linkedin.feathr.offline.config.location.GenericLocation.writeDf(GenericLocation.scala:58)
    at
```
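For context on the "Caused by" line: Spark resolves a short data source name like cosmos.oltp by scanning DataSourceRegister implementations through java.util.ServiceLoader and, if nothing matches, falls back to loading a class named cosmos.oltp.DefaultSource, which is the name that surfaces in the exception. A quick way to see what the job's classloader can actually find is a diagnostic sketch like the one below (put together for this report, not Feathr code):

```scala
// Diagnostic sketch: list the data-source short names visible to the current
// classloader. If "cosmos.oltp" is missing, the connector's registration file
// (META-INF/services/org.apache.spark.sql.sources.DataSourceRegister) is not
// on the classpath / did not survive packaging into the job jar.
import java.util.ServiceLoader
import scala.collection.JavaConverters._
import org.apache.spark.sql.sources.DataSourceRegister

val shortNames = ServiceLoader
  .load(classOf[DataSourceRegister], Thread.currentThread().getContextClassLoader)
  .asScala
  .map(_.shortName())
  .toList

println(shortNames.sorted.mkString("\n"))
println(s"cosmos.oltp visible: ${shortNames.contains("cosmos.oltp")}")
```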
Source - https://github.com/feathr-ai/feathr/actions/runs/3971935845/jobs/6810424495
To Reproduce
Use the azure-cosmos-spark package on Databricks by bundling it in the job's Maven-built jar rather than installing it directly on the cluster as a library.
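A minimal repro would be a plain DataFrameWriter call against the cosmos.oltp format with the connector only present inside the job jar. A sketch (endpoint, key, database and container values are placeholders):

```scala
// Minimal repro sketch: with azure-cosmos-spark only bundled inside the job jar
// (not installed as a cluster library), this save() fails with
// "Failed to find data source: cosmos.oltp".
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
val df = spark.range(5).selectExpr("cast(id as string) as id", "id as value")

df.write
  .format("cosmos.oltp")
  .option("spark.cosmos.accountEndpoint", "<placeholder-endpoint>") // placeholder
  .option("spark.cosmos.accountKey", "<placeholder-key>")           // placeholder
  .option("spark.cosmos.database", "<placeholder-database>")        // placeholder
  .option("spark.cosmos.container", "<placeholder-container>")      // placeholder
  .mode("append")
  .save()
```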
Code Snippet
See the snippet under Expected behavior below.
Expected behavior
It should work as expected and connect to CosmosDB. The relevant code from GenericLocation.scala (see the stack trace above):

```scala
location.format.toLowerCase() match {
  case "cosmos.oltp" =>
    // Ensure the database and the table exist before writing
    val endpoint = location.options.getOrElse("spark.cosmos.accountEndpoint",
      throw new FeathrException("Missing spark.cosmos.accountEndpoint"))
    val key = location.options.getOrElse("spark.cosmos.accountKey",
      throw new FeathrException("Missing spark.cosmos.accountKey"))
    val databaseName = location.options.getOrElse("spark.cosmos.database",
      throw new FeathrException("Missing spark.cosmos.database"))
    val tableName = location.options.getOrElse("spark.cosmos.container",
      throw new FeathrException("Missing spark.cosmos.container"))
    ss.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
    ss.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", endpoint)
    ss.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", key)
    ss.sql(s"CREATE DATABASE IF NOT EXISTS cosmosCatalog.${databaseName};")
    ss.sql(s"CREATE TABLE IF NOT EXISTS cosmosCatalog.${databaseName}.${tableName} using cosmos.oltp TBLPROPERTIES(partitionKeyPath = '/id')")
```
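The quoted snippet is truncated before the actual write; based on the stack trace (GenericLocation.writeDf calling DataFrameWriter.save), the write that follows is presumably along these lines (a hedged sketch, not the verbatim Feathr code):

```scala
// Hypothetical continuation (assumed, not copied from Feathr sources): after the
// catalog/table setup above, the DataFrame is handed to DataFrameWriter with the
// same format and options, which is exactly where DataSource.lookupDataSourceV2
// fails to resolve "cosmos.oltp" in the stack trace.
import org.apache.spark.sql.DataFrame

def writeToCosmos(df: DataFrame, format: String, options: Map[String, String]): Unit =
  df.write
    .format(format)   // "cosmos.oltp"
    .options(options) // the spark.cosmos.* settings from the location config
    .mode("append")   // assumption; the real save mode may differ
    .save()
```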
Screenshots
If applicable, add screenshots to help explain your problem.
Setup (please complete the following information):
If you suspect a dependency version mismatch (e.g. you see NoClassDefFoundError, NoSuchMethodError or similar), please check out the Troubleshoot dependency version conflict article first. If it doesn't provide a solution for the problem, please provide the verbose dependency tree (mvn dependency:tree -Dverbose).

Additional context
Add any other context about the problem here.
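One piece of context that might help triage: the failure mode differs depending on whether the connector jar is missing from the classpath altogether, or whether only its META-INF/services registration was lost when it was repackaged into the job jar (with the Maven Shade plugin, merging those files is what the ServicesResourceTransformer is for). A small diagnostic that could be run in a notebook on the same cluster (a sketch; CosmosCatalog is the class already referenced in the snippet above):

```scala
// Diagnostic sketch (not Feathr code): check which jars contribute a
// DataSourceRegister service file, and whether the connector class loads at all.
import scala.collection.JavaConverters._
import scala.util.Try

val cl = Thread.currentThread().getContextClassLoader

// Every jar that ships a DataSourceRegister registration shows up here; the
// azure-cosmos-spark jar should be among them if its service file survived packaging.
cl.getResources("META-INF/services/org.apache.spark.sql.sources.DataSourceRegister")
  .asScala
  .foreach(println)

// Does the connector class itself load? A failure here means the jar is missing
// entirely rather than just its service registration.
println(Try(Class.forName("com.azure.cosmos.spark.CosmosCatalog", false, cl)))
```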
Information Checklist
Kindly make sure that you have added all the information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.