GeethanadhP / xml-avro

Convert XSD -> AVSC and XML -> AVRO
Apache License 2.0

Integer variables in XSD are converted as String in AVSC #23

Closed jkckiran closed 5 years ago

jkckiran commented 5 years ago

Hi @GeethanadhP, we noticed that all the "integer" variables in the XSD are getting converted to "string" in the AVSC.

Is there a specific reason for this?

GeethanadhP commented 5 years ago

Are you referring to xs:integer? XSD has two integer types, xs:int and xs:integer; see the accepted answer at https://stackoverflow.com/questions/15336872/xsd-what-is-the-difference-between-xsinteger-and-xsint

xs:int is 32-bit, the same as int in Avro (https://avro.apache.org/docs/1.8.1/spec.html). xs:integer is unbounded: it has no maximum value, so a value might not fit in a 32-bit integer if we tried to store it that way in Avro. That is why we use string for it.
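To make the range argument concrete, here is a minimal sketch of the check involved. It is not the converter's actual code; the function name `narrowest_avro_type` is hypothetical, and it only illustrates why an unbounded xs:integer value cannot always be stored as an Avro int (32-bit signed) or long (64-bit signed).

```python
# Signed ranges of the two Avro integer types.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def narrowest_avro_type(value: str) -> str:
    """Return 'int', 'long' or 'string' for one xs:integer lexical value.

    Hypothetical helper for illustration only.
    """
    n = int(value)  # Python ints are unbounded, like xs:integer
    if INT_MIN <= n <= INT_MAX:
        return "int"
    if LONG_MIN <= n <= LONG_MAX:
        return "long"
    return "string"  # too large for any Avro integer type


print(narrowest_avro_type("2147483647"))           # int
print(narrowest_avro_type("2147483648"))           # long
print(narrowest_avro_type("9223372036854775808"))  # string
```

A schema converter sees only the declared type, not the values, so it cannot run a check like this and has to pick a type that is safe for every possible xs:integer.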

One solution: if you are sure those variables will stay within the 32-bit range, change them to xs:int, or to xs:long if they will stay within the 64-bit range. But that's not the best way, as you would have to redo it every time the source team modifies the XSD. The best solution is to cast the values as required when reading the data in Spark.
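A rough sketch of the cast-on-read approach, assuming PySpark and a DataFrame `df` already loaded from the Avro output. The column names and the helper `cast_exprs` are hypothetical; the only Spark call relied on is `DataFrame.selectExpr`, which takes SQL expression strings.

```python
def cast_exprs(all_columns, casts):
    """Build selectExpr strings, casting only the columns named in `casts`.

    `casts` maps a column name to a Spark SQL type name such as INT or BIGINT.
    Hypothetical helper for illustration only.
    """
    exprs = []
    for name in all_columns:
        if name in casts:
            exprs.append(f"CAST(`{name}` AS {casts[name]}) AS `{name}`")
        else:
            exprs.append(f"`{name}`")  # leave other columns untouched
    return exprs


# Hypothetical column names; in practice pass df.columns.
exprs = cast_exprs(
    ["id", "quantity", "comment"],
    {"id": "BIGINT", "quantity": "INT"},
)
print(exprs)

# With a DataFrame `df` read from the Avro files:
# typed_df = df.selectExpr(*exprs)
```

Depending on Spark's ANSI setting, a string that does not fit the target type either becomes null or raises an error, so it is worth checking for unexpected nulls after the cast.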

jkckiran commented 5 years ago

Thanks @GeethanadhP for your input. Yes, it is xs:integer.