AbsaOSS / spline-spark-agent

Spline agent for Apache Spark
https://absaoss.github.io/spline/
Apache License 2.0

EC2MetadataUtils: Unable to retrieve the requested metadata (/latest/meta-data). The requested metadata is not found at http://169.254.169.254/latest/meta-data #614

Closed rupesh3020 closed 1 year ago

rupesh3020 commented 1 year ago

Hi Team,

We are getting the issue below and I am not sure what is causing it. Can you please help me understand what is going wrong here?

23/02/24 10:37:34 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.139.0.4:37634) with ID 0,  ResourceProfileId 0
23/02/24 10:37:34 WARN EC2MetadataUtils: Unable to retrieve the requested metadata (/latest/meta-data). The requested metadata is not found at http://169.254.169.254/latest/meta-data
com.amazonaws.SdkClientException: The requested metadata is not found at http://169.254.169.254/latest/meta-data
    at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:89)
    at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
    at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
    at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:62)
    at com.amazonaws.util.EC2MetadataUtils.getItems(EC2MetadataUtils.java:400)
    at com.amazonaws.util.EC2MetadataUtils.getItems(EC2MetadataUtils.java:380)
    at com.databricks.s3a.logging.InstanceMetadataServiceHelper$.$anonfun$isAws$1(InstanceMetadataServiceHelper.scala:16)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.s3a.logging.InstanceMetadataServiceHelper$.isAws$lzycompute(InstanceMetadataServiceHelper.scala:16)
    at com.databricks.s3a.logging.InstanceMetadataServiceHelper$.isAws(InstanceMetadataServiceHelper.scala:15)
    at com.databricks.backend.common.util.HadoopFSUtil$.setDefaultS3Configuration(HadoopFSUtil.scala:133)
    at com.databricks.backend.common.util.HadoopFSUtil$.createConfiguration(HadoopFSUtil.scala:100)
    at com.databricks.backend.common.util.HadoopFSUtil$.createConfiguration(HadoopFSUtil.scala:41)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2Factory.getHadoopConfiguration(DatabricksFileSystemV2Factory.scala:159)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2Factory.createFileSystem(DatabricksFileSystemV2Factory.scala:43)
    at com.databricks.backend.daemon.data.filesystem.MountEntryResolver.$anonfun$resolve$1(MountEntryResolver.scala:67)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)
    at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:484)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:504)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
    at com.databricks.common.util.locks.LoggedLock$.withAttributionContext(LoggedLock.scala:73)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
    at com.databricks.common.util.locks.LoggedLock$.withAttributionTags(LoggedLock.scala:73)
    at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:479)
    at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:404)
    at com.databricks.common.util.locks.LoggedLock$.recordOperationWithResultTags(LoggedLock.scala:73)
    at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:395)
    at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:367)
    at com.databricks.common.util.locks.LoggedLock$.recordOperation(LoggedLock.scala:73)
    at com.databricks.common.util.locks.LoggedLock$.withLock(LoggedLock.scala:120)
    at com.databricks.common.util.locks.PerKeyLock.withLock(PerKeyLock.scala:36)
    at com.databricks.backend.daemon.data.filesystem.MountEntryResolver.resolve(MountEntryResolver.scala:64)
    at com.databricks.backend.daemon.data.client.DBFSV2.$anonfun$initialize$1(DatabricksFileSystemV2.scala:75)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)
    at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:484)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:504)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionContext(DatabricksFileSystemV2.scala:510)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionTags(DatabricksFileSystemV2.scala:510)
    at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:479)
    at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:404)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperationWithResultTags(DatabricksFileSystemV2.scala:510)
    at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:395)
    at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:367)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperation(DatabricksFileSystemV2.scala:510)
    at com.databricks.backend.daemon.data.client.DBFSV2.initialize(DatabricksFileSystemV2.scala:63)
    at com.databricks.backend.daemon.data.client.DatabricksFileSystem.initialize(DatabricksFileSystem.scala:230)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
    at za.co.absa.spline.harvester.qualifier.HDFSPathQualifier.<init>(HDFSPathQualifier.scala:23)
    at za.co.absa.spline.agent.SplineAgent$.create(SplineAgent.scala:65)
    at za.co.absa.spline.harvester.SparkLineageInitializer.createListener(SparkLineageInitializer.scala:162)
    at za.co.absa.spline.harvester.SparkLineageInitializer.$anonfun$createListener$6(SparkLineageInitializer.scala:139)
    at za.co.absa.spline.harvester.SparkLineageInitializer.withErrorHandling(SparkLineageInitializer.scala:176)
    at za.co.absa.spline.harvester.SparkLineageInitializer.createListener(SparkLineageInitializer.scala:138)
    at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.<init>(SplineQueryExecutionListener.scala:37)
rupesh3020 commented 1 year ago

Can this be a permission issue?

wajda commented 1 year ago

It could. Honestly, I've never seen this error before. Are you doing anything special? What Databricks runtime version is it?

rupesh3020 commented 1 year ago

I created a custom dispatcher that also handles authentication for the client-credentials grant type. The Databricks version is 9.1.
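For context, a custom lineage dispatcher is normally wired in through Spark conf properties following the agent's `spline.lineageDispatcher.*` configuration pattern. The dispatcher name and class below are illustrative placeholders, not the ones actually used in this issue:

```shell
# Hypothetical sketch of registering a custom Spline dispatcher via Spark conf.
# "myAuthDispatcher" and com.example.OAuthLineageDispatcher are placeholders;
# the spline.lineageDispatcher.* key pattern follows the agent's config scheme.
spark-submit \
  --conf spark.spline.lineageDispatcher=myAuthDispatcher \
  --conf spark.spline.lineageDispatcher.myAuthDispatcher.className=com.example.OAuthLineageDispatcher \
  my_job.py
```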

rupesh3020 commented 1 year ago

I hardcoded the version of the Spline agent to 3.2.1; maybe that is the issue, as I also saw this in the logs:

23/02/24 10:40:04 INFO AsyncEventQueue: Process of event SparkListenerSQLExecutionEnd(0,1677235201730) by listener ExecutionListenerBus took 2.698032533s.
23/02/24 10:40:04 ERROR Utils: uncaught error in thread spark-listener-group-shared, stopping SparkContext
java.lang.NoSuchMethodError: org.json4s.CustomSerializer.<init>(Lscala/Function1;Lscala/reflect/ClassTag;)V
    at org.json4s.ext.UUIDSerializer$.<init>(JavaTypesSerializers.scala:30)
    at org.json4s.ext.UUIDSerializer$.<clinit>(JavaTypesSerializers.scala)
    at org.json4s.ext.JavaTypesSerializers$.<init>(JavaTypesSerializers.scala:26)
    at org.json4s.ext.JavaTypesSerializers$.<clinit>(JavaTypesSerializers.scala)
    at za.co.absa.commons.json.format.JavaTypesSupport.formats(JavaTypesSupport.scala:25)
    at za.co.absa.commons.json.format.JavaTypesSupport.formats$(JavaTypesSupport.scala:23)
    at __wrapper$1$25f9a300cf1145849f37e85c15d59942.__wrapper$1$25f9a300cf1145849f37e85c15d59942$$anon$1.formats(<no source file>:2)
    at za.co.absa.commons.json.AbstractJsonSerDe$EntityToJson.<init>(AbstractJsonSerDe.scala:33)
    at za.co.absa.commons.json.AbstractJsonSerDe.EntityToJson(AbstractJsonSerDe.scala:32)
    at za.co.absa.commons.json.AbstractJsonSerDe.EntityToJson$(AbstractJsonSerDe.scala:32)
    at __wrapper$1$25f9a300cf1145849f37e85c15d59942.__wrapper$1$25f9a300cf1145849f37e85c15d59942$$anon$1.EntityToJson(<no source file>:2)
wajda commented 1 year ago

I hardcoded the version of spline agent to 3.2.1

Hm. Databricks Runtime 9.1 is based on Spark 3.1.2, not 3.2.1. Make sure you use the correct build of the Spline agent bundle, and please also use the latest released version of it (1.0.4 at this time).
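The version-matching advice above can be made concrete. Assuming the bundle follows the published naming convention (`spark-<spark-minor>-spline-agent-bundle_<scala-version>`), the Maven coordinates for a given runtime's Spark version would be derived like this:

```shell
# Derive Spline agent bundle coordinates from the Spark version the
# runtime actually ships (DBR 9.1 LTS ships Spark 3.1.2), not the
# Spark version you might wish to target.
SPARK_VERSION="3.1.2"
SPARK_MINOR="${SPARK_VERSION%.*}"   # strip the patch component -> 3.1
SPLINE_VERSION="1.0.4"              # latest release at the time of this thread
COORDS="za.co.absa.spline.agent.spark:spark-${SPARK_MINOR}-spline-agent-bundle_2.12:${SPLINE_VERSION}"
echo "$COORDS"
# -> za.co.absa.spline.agent.spark:spark-3.1-spline-agent-bundle_2.12:1.0.4
```

The `NoSuchMethodError` on `org.json4s.CustomSerializer.<init>` is the classic symptom of this kind of mismatch: a bundle built for one Spark minor version links against a json4s version whose binary signatures differ from the one on the runtime's classpath.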