databricks / spark-xml

XML data source for Spark SQL and DataFrames
Apache License 2.0
500 stars, 226 forks

Read multiple XML files and get the file name as metadata #628

Closed writetoarun closed 1 year ago

writetoarun commented 1 year ago

I am trying to read multiple XML files into a Spark DataFrame and include each XML file's name in the DataFrame, using the statement below:

spark.read.format("com.databricks.spark.xml").option("rowTag","row").load("/raw/Batch/////").select("*", "_metadata")

but I get an error saying that _metadata is not found.

srowen commented 1 year ago

I am not sure what you're referring to. What is _metadata?

srowen commented 1 year ago

Oh, that might be a property of DSv2 or something, or specific to Databricks. I don't think that exists for this data source.
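
One possible workaround, sketched here rather than confirmed in this thread, is Spark's built-in `input_file_name()` function instead of the `_metadata` column. It returns the full path or URI of the file behind each row, or an empty string when Spark cannot determine it. Whether it is populated for spark-xml reads in a given version is an assumption to verify. The helper names `file_basename` and `read_xml_with_file_name`, and the column names `source_file` and `source_file_name`, are hypothetical and chosen only for illustration.

```python
import posixpath
from urllib.parse import unquote, urlparse


def file_basename(uri: str) -> str:
    """Return the last path component of a file URI or plain path."""
    # input_file_name() typically yields values such as
    # "file:///raw/Batch/a.xml"; keep only the path part and decode %xx escapes.
    path = unquote(urlparse(uri).path)
    return posixpath.basename(path)


def read_xml_with_file_name(spark, path, row_tag="row"):
    """Read XML files and add the source file path and name as columns.

    Hypothetical helper: `spark` is an existing SparkSession and `path`
    is whatever path or glob you already pass to load().
    """
    # Imported lazily so file_basename stays usable without PySpark installed.
    from pyspark.sql.functions import input_file_name, udf
    from pyspark.sql.types import StringType

    df = (
        spark.read.format("com.databricks.spark.xml")
        .option("rowTag", row_tag)
        .load(path)
    )
    basename_udf = udf(file_basename, StringType())
    return (
        df.withColumn("source_file", input_file_name())  # full path or URI
        .withColumn("source_file_name", basename_udf("source_file"))  # name only
    )
```

Calling `read_xml_with_file_name(spark, path)` with the same path used in the original statement should give the parsed rows plus the two extra columns. If `source_file` comes back empty, this data source is not exposing the file name through that function.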