Closed Kishor-Radhakrishnan closed 1 year ago
Thank you for sharing the logs! Looking into them, it seems the source of the error is the function app you're pointing to. For example, on line 4307 of the attached log4j output, you'll see the error message:
ERROR EventEmitter: Could not emit lineage w/ exception java.net.UnknownHostException: functionappkae2.azurewebsites.net: Name or service not known
Can you ensure your function app is running and accessible from the Databricks workspace? Please see the following section in our troubleshooting guide for details on what to check.
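As a quick first check, something like the following could be run in a notebook cell on the affected cluster to see whether the function app's hostname resolves at all. This is only a minimal sketch: the helper name `can_resolve` and the default port 443 are my own assumptions, and it tests DNS resolution only, not whether the function app is actually running.

```python
import socket


def can_resolve(hostname: str, port: int = 443) -> bool:
    """Return True if DNS resolution for `hostname` succeeds from this machine.

    Hypothetical helper for illustration; it does not open a connection,
    it only asks the resolver for an address.
    """
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Resolution failed; this is the Python-side counterpart of the
        # java.net.UnknownHostException seen in the log4j output.
        return False
```

If `can_resolve("functionappkae2.azurewebsites.net")` returns `False` on the cluster, that would be consistent with the `UnknownHostException` above and points at the hostname or the network configuration rather than at the lineage code.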
Identified the issue. Please close the issue.
We are missing lineage info for a few notebooks. I am pasting the error from log4j and attaching the logs.
23/04/04 13:00:14 WARN RddExecutionContext: Unable to access job conf from RDD
java.lang.NoSuchFieldException: Field is not instance of HadoopMapRedWriteConfigUtil
    at io.openlineage.spark.agent.lifecycle.RddExecutionContext.lambda$setActiveJob$0(RddExecutionContext.java:117)
    at java.util.Optional.orElseThrow(Optional.java:290)
    at io.openlineage.spark.agent.lifecycle.RddExecutionContext.setActiveJob(RddExecutionContext.java:115)
    at java.util.Optional.ifPresent(Optional.java:159)
    at io.openlineage.spark.agent.OpenLineageSparkListener.lambda$onJobStart$9(OpenLineageSparkListener.java:168)
    at java.util.Optional.ifPresent(Optional.java:159)
    at io.openlineage.spark.agent.OpenLineageSparkListener.onJobStart(OpenLineageSparkListener.java:165)
    at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:37)
    at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
    at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:118)
    at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:102)
    at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
    at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
    at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
    at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1623)
    at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)
23/04/04 13:00:14 INFO RddExecutionContext: Found job conf from RDD Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-rbf-default.xml, hdfs-site.xml, hdfs-rbf-site.xml
23/04/04 13:00:14 INFO RddExecutionContext: Found output path null from RDD MapPartitionsRDD[16] at $anonfun$execute$3 at FrameProfiler.scala:80
23/04/04 13:00:14 INFO RddExecutionContext: RDDs are empty: skipping sending OpenLineage event
log4j-active (4).txt