SubratPanigrahi closed this issue 4 years ago
Hi, we have the same problem.
`error: object sqldb is not a member of package com.microsoft.azure` when running `import com.microsoft.azure.sqldb.spark.connect._`
When we start the cluster manually, everything works, both when running the notebook by hand and when triggering the job. When the scheduler starts the cluster, the error occurs in about 90% of cases.
Any ideas?
Does this error occur if you are using a data engineering cluster instead of an interactive cluster?
I have had this same problem with interactive clusters. It occurs because Databricks installs libraries on interactive clusters asynchronously after cluster start, so the notebook may execute that import statement before azure-sqldb-spark has finished installing.
So I think the solution is either to use data engineering clusters (if the slower start time is acceptable) or to add logic that waits for the cluster's library installation to finish before running the imports. For example, you can poll the cluster's library installation status via the Databricks Libraries API.
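The polling idea above could be sketched roughly like this, using the Databricks Libraries API endpoint `GET /api/2.0/libraries/cluster-status`. This is a minimal sketch, not a tested implementation: the workspace host, personal access token, and cluster id are placeholders you would supply, and you would run something like this at the start of the scheduled job, before the imports.

```python
import json
import time
import urllib.request

def all_libraries_installed(status_response: dict) -> bool:
    """True when every library attached to the cluster reports INSTALLED."""
    statuses = status_response.get("library_statuses", [])
    return all(s.get("status") == "INSTALLED" for s in statuses)

def wait_for_libraries(host: str, token: str, cluster_id: str,
                       timeout_s: int = 600, poll_s: int = 10) -> None:
    """Poll the Libraries API until all installs finish or the timeout hits.

    host       - workspace URL, e.g. "https://<workspace>.azuredatabricks.net"
    token      - a Databricks personal access token (placeholder here)
    cluster_id - id of the interactive cluster the job is attached to
    """
    url = f"{host}/api/2.0/libraries/cluster-status?cluster_id={cluster_id}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        with urllib.request.urlopen(request) as resp:
            body = json.load(resp)
        if all_libraries_installed(body):
            return
        time.sleep(poll_s)
    raise TimeoutError("cluster libraries were not installed within the timeout")
```

Note that statuses other than `INSTALLED` (such as `PENDING`, `INSTALLING`, or `FAILED`) simply keep the loop waiting here; a fuller version might fail fast on `FAILED`.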
The cluster is the same, just once started manually, once by a scheduled job.
@rworbis if you monitor your interactive cluster's "Libraries" tab status indicators while the scheduled job is starting, you will probably notice that the failure occurs before the installation of azure-sqldb-spark has completed.
If that is the case, this is a Databricks issue.
The suggestion from @tkasu is probably the best hint the OP can get on this context-specific issue. Closing the issue, as it is unlikely that anyone from the community can provide more prescriptive advice.
Hi All,
Our scheduled job in Azure Databricks fails intermittently with the error message below, but it doesn't fail when we run the notebook manually. We have attached an interactive cluster to the notebook and are not sure why it fails intermittently.
`error: object sqldb is not a member of package com.microsoft.azure` when running `import com.microsoft.azure.sqldb.spark.bulkcopy.BulkCopyMetadata`
Any suggestions would be appreciated!