open-metadata / openmetadata-spark-agent

Apache License 2.0
WARN LogicalPlanSerializer: Unable to writeValueAsString #3

Closed: ulixius9 closed this issue 7 months ago

ulixius9 commented 8 months ago

Fix the warning "Unable to writeValueAsString". It results in infinite recursion, and lineage generation fails. Logs are attached.

debug.txt

harshach commented 8 months ago

@ulixius9 can you give me steps to reproduce this locally?

dolfinus commented 7 months ago

I'm getting an almost identical error when trying to save a DataFrame to a Hive table:

from pyspark.sql import SparkSession

spark = (
  SparkSession.builder.master('local')
    .appName('sample_spark')
    .config("spark.jars", "./openmetadata-spark-agent-1.0-beta.jar")
    .config("spark.extraListeners", "org.openmetadata.spark.agent.OpenMetadataSparkListener")
    .config("spark.openmetadata.transport.hostPort", "http://localhost:8585")
    .config("spark.openmetadata.transport.type", "openmetadata")
    .config("spark.openmetadata.transport.jwtToken", "<token>")
    .config("spark.openmetadata.transport.pipelineServiceName", "airflow")
    .config("spark.openmetadata.transport.pipelineName", "my_dag")
    .config("spark.openmetadata.transport.pipelineSourceUrl", "http://airflow.domain/dags/my_dag")
    .config("spark.openmetadata.transport.pipelineDescription", "My ETL Pipeline")
    .config("spark.openmetadata.transport.timeout", "30")
    .enableHiveSupport()
    .getOrCreate()
)

df = spark.createDataFrame([{"field": "value"}])
df.write.saveAsTable("some")

openmetadata.log

This happens on both Spark 3.4.1 and 3.5.0.

dolfinus commented 7 months ago

It looks like this is the same issue as https://github.com/OpenLineage/OpenLineage/issues/2082. Setting .config("spark.openlineage.facets.disabled", "[spark_unknown;spark.logicalPlan]") made the warning go away.

However, the only object created in OpenMetadata is the Spark pipeline, without any attributes or lineage.
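
For anyone who wants to try the same workaround, here is a rough sketch of how the facet setting could be merged into the session setup from my earlier snippet. The config keys and values are copied from this thread; the helper names (`apply_conf`, `build_session`) and the two dict names are made up for illustration and are not part of the agent. As noted above, this only silences the warning and does not make lineage appear.

```python
# Keys and values are copied from this thread. The dict and function names
# are hypothetical and exist only for this illustration.
OPENMETADATA_CONF = {
    "spark.jars": "./openmetadata-spark-agent-1.0-beta.jar",
    "spark.extraListeners": "org.openmetadata.spark.agent.OpenMetadataSparkListener",
    "spark.openmetadata.transport.hostPort": "http://localhost:8585",
    "spark.openmetadata.transport.type": "openmetadata",
    "spark.openmetadata.transport.jwtToken": "<token>",
    "spark.openmetadata.transport.pipelineServiceName": "airflow",
    "spark.openmetadata.transport.pipelineName": "my_dag",
}

# Workaround taken from the linked OpenLineage issue: disable the facets
# that trigger serialization of the logical plan.
WORKAROUND_CONF = {
    "spark.openlineage.facets.disabled": "[spark_unknown;spark.logicalPlan]",
}


def apply_conf(builder, conf):
    """Call builder.config(key, value) for every entry and return the builder."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


def build_session():
    """Create a local Hive-enabled session with the agent and the workaround."""
    # Imported lazily so the helpers above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master("local").appName("sample_spark")
    builder = apply_conf(builder, {**OPENMETADATA_CONF, **WORKAROUND_CONF})
    return builder.enableHiveSupport().getOrCreate()


# Usage (requires pyspark and the agent jar at the path above):
#   spark = build_session()
#   spark.createDataFrame([{"field": "value"}]).write.saveAsTable("some")
```

Keeping the workaround in its own dict makes it easy to drop once the serializer warning is fixed in the agent itself.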