databrickslabs / overwatch

Capture deep metrics on one or all assets within a Databricks workspace

Issue in module 2010 Silver_JobsStatus #1036

Closed james-d-cole closed 9 months ago

james-d-cole commented 10 months ago

Overwatch Version com.databricks.labs:overwatch_2.12:0.7.2.0.4

Describe the bug Since upgrading from 0.7.1.3 to 0.7.2.0.4, module 2010 (Silver_JobsStatus) is failing with the message: "API CALL Failed [AMBIGUOUS_REFERENCE] Reference Message is ambiguous, could be: [Message, Message]." The 0.7.2.0.4 release notes include a bug fix for a merge issue in this module; is this the same issue still occurring?

sriram251-code commented 10 months ago

Hi @james-d-cole, you are right, we have an existing bug for this: https://github.com/databrickslabs/overwatch/issues/911

We have fixed it, and the fix will be available in the next release, 0.7.2.1. The tentative release date is early next week.

mohanbaabu1996 commented 9 months ago

Hi @james-d-cole,

The new version 0.7.2.1 has been released; please follow the documentation to upgrade. Let us know if you have any questions. Thank you!

james-d-cole commented 9 months ago

Hi @mohanbaabu1996 - I have updated to 0.7.2.1 but I am still getting the same error as above. The full error is below. This is the only module that is failing for me.

FAILED --> ERROR: FAILED: 2010-Silver_JobsStatus Module: Failed

Reference 'Message' is ambiguous, could be: Message, Message.
org.apache.spark.sql.AnalysisException: Reference 'Message' is ambiguous, could be: Message, Message.
at org.apache.spark.sql.catalyst.expressions.package$AttributeSeq.resolve(package.scala:408)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveChildren(LogicalPlan.scala:116)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$resolveExpressionByPlanChildren$1(Analyzer.scala:2660)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$resolveExpression$2(Analyzer.scala:2586)
at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:71)
at org.apache.spark.sql.catalyst.analysis.Analyzer.innerResolve$1(Analyzer.scala:2593)
at org.apache.spark.sql.catalyst.analysis.Analyzer.resolveExpression(Analyzer.scala:2613)
at org.apache.spark.sql.catalyst.analysis.Analyzer.resolveExpressionByPlanChildren(Analyzer.scala:2666)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$17.$anonfun$applyOrElse$102(Analyzer.scala:2368)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:204)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:99)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:204)
at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:215)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:220)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.AbstractTraversable.map(Traversable.scala:108)
at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:220)
at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:225)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:355)
at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:225)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$17.applyOrElse(Analyzer.scala:2368)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$17.applyOrElse(Analyzer.scala:2164)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:139)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:99)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:139)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:354)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:135)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:131)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:2164)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:2145)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$3(RuleExecutor.scala:216)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:216)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:213)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:205)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:205)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:331)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:324)
at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:231)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:324)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:252)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:184)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:154)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:184)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:304)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:361)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:303)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:147)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:340)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$3(QueryExecution.scala:337)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:763)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:337)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:985)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:334)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:141)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:141)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:133)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:98)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:985)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
at org.apache.spark.sql.Dataset.withPlan(Dataset.scala:4327)
at org.apache.spark.sql.Dataset.select(Dataset.scala:1630)
at com.databricks.labs.overwatch.utils.asofJoin$.addColumnsFromOtherDF(asofJoin.scala:50)
at com.databricks.labs.overwatch.utils.asofJoin$.lookupWhenExec(asofJoin.scala:228)
at com.databricks.labs.overwatch.utils.BaseTSDF.lookupWhen(TSDF.scala:134)
at com.databricks.labs.overwatch.pipeline.WorkflowsTransforms$.jobStatusLookupJobMeta(WorkflowsTransforms.scala:193)
at com.databricks.labs.overwatch.pipeline.SilverTransforms.$anonfun$dbJobsStatusSummary$1(SilverTransforms.scala:1218)
at org.apache.spark.sql.Dataset.transform(Dataset.scala:3165)
at com.databricks.labs.overwatch.pipeline.SilverTransforms.dbJobsStatusSummary(SilverTransforms.scala:1218)
at com.databricks.labs.overwatch.pipeline.SilverTransforms.dbJobsStatusSummary$(SilverTransforms.scala:1172)
at com.databricks.labs.overwatch.pipeline.Silver.dbJobsStatusSummary(Silver.scala:7)
at com.databricks.labs.overwatch.pipeline.Silver.$anonfun$appendJobStatusProcess$2(Silver.scala:258)
at org.apache.spark.sql.Dataset.transform(Dataset.scala:3165)
at com.databricks.labs.overwatch.pipeline.ETLDefinition.$anonfun$executeETL$1(ETLDefinition.scala:30)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at com.databricks.labs.overwatch.pipeline.ETLDefinition.executeETL(ETLDefinition.scala:28)
at com.databricks.labs.overwatch.pipeline.Module.execute(Module.scala:388)
at com.databricks.labs.overwatch.pipeline.Silver.$anonfun$executeModules$1(Silver.scala:421)
at scala.collection.immutable.List.foreach(List.scala:431)
at com.databricks.labs.overwatch.pipeline.Silver.executeModules(Silver.scala:404)
at com.databricks.labs.overwatch.pipeline.Silver.run(Silver.scala:431)
at com.databricks.labs.overwatch.MultiWorkspaceDeployment.startSilverDeployment(MultiWorkspaceDeployment.scala:231)
at com.databricks.labs.overwatch.MultiWorkspaceDeployment.$anonfun$executePipelines$1(MultiWorkspaceDeployment.scala:455)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
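For context on what this exception means: Spark's analyzer raises AMBIGUOUS_REFERENCE when a column name matches more than one attribute in scope, typically because both sides of a join (here, the `lookupWhen` in the trace) carry a column named `Message`. A minimal, Spark-free sketch of that resolution rule (an illustration only, with simplified logic and hypothetical names, not Spark's actual code):

```python
# Simplified sketch of attribute resolution: a name that matches more than
# one attribute in scope is ambiguous, so analysis fails.
class AmbiguousReferenceError(Exception):
    pass

def resolve(name, attributes):
    """Resolve a column name (case-insensitively) against the attributes in scope."""
    matches = [a for a in attributes if a.lower() == name.lower()]
    if not matches:
        return None  # unresolved reference
    if len(matches) > 1:
        # Mirrors the shape of: Reference 'Message' is ambiguous, could be: Message, Message.
        raise AmbiguousReferenceError(
            f"Reference '{name}' is ambiguous, could be: {', '.join(matches)}."
        )
    return matches[0]

# Two join inputs that both carry a 'Message' column:
left_cols = ["timestamp", "jobId", "Message"]
right_cols = ["runName", "Message"]

try:
    resolve("Message", left_cols + right_cols)
except AmbiguousReferenceError as e:
    print(e)  # Reference 'Message' is ambiguous, could be: Message, Message.
```

This is why the error appears only when the extra `Message` column is present in audit_log_bronze: with a single match, resolution succeeds; with two, it cannot pick one.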

mohanbaabu1996 commented 9 months ago

Hi @james-d-cole

Could you share the pipeline report so that we can validate?

select * from ETL_DB.pipeline_report order by Pipeline_SnapTs desc

Regards, Mohan Baabu

james-d-cole commented 9 months ago

Hi @mohanbaabu1996

The only way I have been able to resolve this error is by removing the field 'Message' from the table audit_log_bronze, which is a dependency for the 2010 module. This field only contained nulls. I wonder if we have missed a schema update script somewhere along the way?
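For what it's worth, this workaround amounts to what a column de-duplication step before the lookup join would do: keep the join keys, and drop any right-side column whose name already exists on the left. A hedged, Spark-free sketch under that assumption (the helper name is hypothetical; this is not Overwatch's actual `asofJoin` logic):

```python
# Hypothetical helper: given the column lists of the two sides of a lookup
# join, return the right-side columns to keep so that no non-key name
# collides (case-insensitively) with a left-side column.
def dedupe_lookup_columns(left_cols, right_cols, join_keys):
    left_lower = {c.lower() for c in left_cols}
    keys_lower = {k.lower() for k in join_keys}
    kept = []
    for col in right_cols:
        if col.lower() in keys_lower or col.lower() not in left_lower:
            kept.append(col)
    return kept

# audit_log_bronze contributing a null-only 'Message' column that also
# exists on the other side is what triggers the ambiguity:
left = ["timestamp", "jobId", "Message"]
right = ["jobId", "runName", "Message"]
print(dedupe_lookup_columns(left, right, ["jobId"]))  # ['jobId', 'runName']
```

Dropping the duplicate name on one side before selecting is the general fix; removing the field from the source table achieves the same effect but, as seen below, does not survive the column reappearing.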

Are you able to share the data model of the ETL layers? The documentation only includes the gold layer model.

I have also attached the output you requested.

mohanbaabu1996 commented 9 months ago

Thanks for the report; we will check and let you know.

Here is the link for the module flow

mohanbaabu1996 commented 9 months ago

[image attachment]

So, to confirm: after removing the "Message" field from audit_log_bronze, the module worked fine. Is my understanding correct?

james-d-cole commented 9 months ago

Hi @mohanbaabu1996,

That did seem to fix it for a couple of runs. However, I have now started to get the same error again. The field Message had reappeared in the table audit_log_bronze, but removing it has not fixed the error this time.

The module flow doesn't give me what I am looking for. I was hoping to get a full data model for all layers to check whether our schema is as you would expect.

I have attached an updated report.

mohanbaabu1996 commented 9 months ago

Hi @james-d-cole,

Could you please create a support ticket through your cloud provider and share the ticket number with us, so that we can log in and troubleshoot?

Also, could you please remove the report attachment from your earlier message.

Thanks!

james-d-cole commented 9 months ago

Hi @mohanbaabu1996. Case number is 00376224. Thank you for your help!

mohanbaabu1996 commented 9 months ago

Hi @james-d-cole,

Thanks for the ticket. Did you add the Message column back to the audit_log_bronze table?

[image attachment]