I'm observing some unexplained behavior when reading with versus without a schema definition. It may have something in common with this issue.
My understanding is that the method StaxXmlParserUtils#currentStructureAsString should be called whenever it's converting a field declared as StringType.
The problem is that the method does not get called when providing a schema.
I took the first test from XmlSuite that calls this method. When enforcing a schema, the method is no longer being used.
test("DSL test with mixed elements (struct, string)") {
  val schema = buildSchema(
    field("age", IntegerType),
    struct("name", field("firstName"))
  )
  val results = spark.read
    .option("rowTag", "person")
    .schema(schema)
    .xml(resDir + "ages-mixed-types.xml")
    .collect()
  assert(results.length === 3)
}
It seems that when using a schema, we never enter this case within convertComplicatedType:
...
case _: StringType => StaxXmlParserUtils.currentStructureAsString(parser)
Maybe I defined a wrong schema?
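One way to check this might be to compare two schemas. The sketch below uses only the plain Spark SQL types API; the object name SchemaVariants and the val names are hypothetical, and it assumes that the field helper in the test defaults to StringType. In the schema I used, name is declared as a struct, so if the method is only reached for fields declared as StringType, it may simply never apply to name. Declaring name as a plain string would be a way to test that guess.

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical holder object, for illustration only.
object SchemaVariants {
  // Equivalent of the schema in the test above (assumption: field("firstName")
  // defaults to StringType). Here `name` is a struct, not a string.
  val structSchema: StructType = StructType(Seq(
    StructField("age", IntegerType, nullable = true),
    StructField("name", StructType(Seq(
      StructField("firstName", StringType, nullable = true)
    )), nullable = true)
  ))

  // Variant for comparison: `name` is declared directly as StringType.
  val stringSchema: StructType = StructType(Seq(
    StructField("age", IntegerType, nullable = true),
    StructField("name", StringType, nullable = true)
  ))
}
```

Passing SchemaVariants.stringSchema to .schema(...) in the test above, and checking whether StaxXmlParserUtils.currentStructureAsString is then called, should show whether the behavior depends on the declared type of name rather than on the mere presence of a schema.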