mspnp / spark-monitoring

Monitoring Azure Databricks jobs

Kusto queries contain columns which don't exist in Azure Log Analytics data #230

Open · lukascervenka opened this issue 11 months ago

lukascervenka commented 11 months ago

Hello, I use Databricks Runtime Version 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12) and send data to Azure Log Analytics.

The Kusto queries in branch l4jv2m, in the file "perftools/deployment/loganalytics/logAnalyticsDeploy.json", don't work. For example, the query "Task Throughput (Sum Of Tasks Per Stage)" uses a column "Properties_spark_app_id_s", which exists but in a different format, "properties_spark_app_id_s". Another column, "Stage_Info_Stage_ID_d", doesn't exist. Maybe it was replaced by "stageIds_s"?

```kusto
let result = SparkListenerEvent_CL
| where Event_s contains "SparkListenerStageSubmitted"
| extend metricsns = columnifexists("Properties_spark_metrics_namespace_s", Properties_spark_app_id_s)
| extend apptag = iif(isnotempty(metricsns), metricsns, Properties_spark_app_id_s)
| project Stage_Info_Stage_ID_d, Stage_Info_Stage_Name_s, Stage_Info_Submission_Time_d, Event_s, TimeGenerated, Properties_spark_databricks_clusterUsageTags_clusterName_s, apptag
| order by TimeGenerated asc nulls last
| join kind=inner (
    SparkListenerEvent_CL
    | where Event_s contains "SparkListenerTaskEnd"
    | where Task_End_Reason_Reason_s contains "Success"
    | project Stage_ID_d, Task_Info_Task_ID_d, TaskEvent = Event_s, TimeGenerated
) on $left.Stage_Info_Stage_ID_d == $right.Stage_ID_d;
result
| extend slice = strcat("#TasksCompleted ", Properties_spark_databricks_clusterUsageTags_clusterName_s, "-", apptag, " ", Stage_Info_Stage_Name_s)
| summarize count(TaskEvent) by bin(TimeGenerated, 1m), slice
| order by TimeGenerated asc nulls last
```

The query returns: `'extend' operator: Failed to resolve scalar expression named 'Properties_spark_app_id_s'`
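A possible workaround, sketched under the assumption that the breaking change is only the lower-cased prefix of the custom fields (the replacement for "Stage_Info_Stage_ID_d" would still need confirming): resolve the column with `column_ifexists` and a constant innermost default, so the fallback expression never references a column that may not exist:

```kusto
// Sketch: tolerate both casings of the app-id column. With a literal ""
// as the innermost default, a missing column no longer breaks 'extend'.
SparkListenerEvent_CL
| where Event_s contains "SparkListenerStageSubmitted"
| extend appId = column_ifexists("Properties_spark_app_id_s",
                   column_ifexists("properties_spark_app_id_s", ""))
| extend metricsns = column_ifexists("Properties_spark_metrics_namespace_s", appId)
| extend apptag = iif(isnotempty(metricsns), metricsns, appId)
```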

msetodev commented 11 months ago

@lukascervenka Can I see your /src/spark-listeners/src/main/scala/org/apache/spark/listeners/UnifiedSparkListener.scala file? I cannot get mine to compile as the sparkEventToJson function does not exist in the latest Spark version.
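(For context, not from the repository: in Spark 3.4, JsonProtocol was rewritten on top of Jackson and the json4s-based sparkEventToJson was removed; sparkEventToJsonString is exposed instead. A minimal sketch of an adaptation follows — the object name EventJson is hypothetical:)

```scala
package org.apache.spark.listeners

import org.apache.spark.scheduler.SparkListenerEvent
import org.apache.spark.util.JsonProtocol

// Hypothetical helper. JsonProtocol is private[spark], so this file must live
// under the org.apache.spark package tree, as UnifiedSparkListener already does.
object EventJson {
  // Spark 3.4+ removed sparkEventToJson (which returned a json4s JValue);
  // the Jackson-based sparkEventToJsonString returns the event JSON as a String.
  def toJsonString(event: SparkListenerEvent): String =
    JsonProtocol.sparkEventToJsonString(event)
}
```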

lukascervenka commented 11 months ago

Hello @msetodev, I use branch l4jv2m and this file doesn't exist there. It exists only in the main branch.