Open mattpollock opened 10 years ago
If I recall correctly, we were primarily focused on Avro files with the schema embedded. In that case, at least for the data we tested, record schemas were duplicated everywhere they appeared in the schema ("moments" would be defined in both places). It seems this is not the case for the schema metadata in your Avro files?
The data was generated using Pig. When I attempted to explicitly define `moments` throughout the schema, it threw errors (protecting, I think, against my giving the same name to different types of records). I assumed that this was not unique to the Avro/Pig handshake but a general Avro schema requirement. Perhaps that isn't the case.
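For what it's worth, the Avro specification does say that a schema may not contain multiple definitions of the same fullname, so the error is probably not specific to Pig. Below is a minimal sketch of that rule in plain Python, not the check Pig or the Avro library actually runs. The helper `find_duplicate_names` and both sample schemas are hypothetical (only the field and record names come from this thread), and namespaces are ignored for simplicity.

```python
NAMED_TYPES = {"record", "enum", "fixed"}


def find_duplicate_names(schema, seen=None, dupes=None):
    """Return names defined more than once in a parsed (JSON) Avro schema."""
    if seen is None:
        seen = set()
    if dupes is None:
        dupes = []
    if isinstance(schema, list):
        # Union: inspect every branch.
        for branch in schema:
            find_duplicate_names(branch, seen, dupes)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if isinstance(kind, (dict, list)):
            # Nested type description, e.g. {"type": {"type": "array", ...}}.
            find_duplicate_names(kind, seen, dupes)
        elif kind in NAMED_TYPES:
            name = schema["name"]
            if name in seen:
                dupes.append(name)
            seen.add(name)
            for field in schema.get("fields", []):
                find_duplicate_names(field["type"], seen, dupes)
        elif kind == "array":
            find_duplicate_names(schema["items"], seen, dupes)
        elif kind == "map":
            find_duplicate_names(schema["values"], seen, dupes)
    # Plain strings are primitives or references by name, never definitions.
    return dupes


# Hypothetical, trimmed-down definition of the `moments` record.
MOMENTS = {"type": "record", "name": "moments",
           "fields": [{"name": "mean", "type": "double"},
                      {"name": "variance", "type": "double"}]}

# Defined once, then referenced by name.
defined_once = {"type": "record", "name": "routemetrics", "fields": [
    {"name": "initialalttude", "type": ["null", MOMENTS]},
    {"name": "terminalaltitude", "type": ["null", "moments"]}]}

# Full definition repeated in both places.
defined_twice = {"type": "record", "name": "routemetrics", "fields": [
    {"name": "initialalttude", "type": ["null", MOMENTS]},
    {"name": "terminalaltitude", "type": ["null", MOMENTS]}]}

print(find_duplicate_names(defined_once))
print(find_duplicate_names(defined_twice))
```

The first schema yields no duplicates; the second reports `moments`, which is the situation that was rejected when the record was spelled out everywhere.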
Regardless, the way I defined the schema when saving the data and the way it pops out when using `avro-tools getschema` on a resulting data file (which is what I pasted above) are consistent, defining `moments` only once. This does not cause any hiccups for `avro-tools tojson`. Also, messing around with the Java API, calling `fld.schema().getFields()` (where `fld` is an object of type `org.apache.avro.Schema.Field`) on fields where `moments` is the type but is not explicitly defined (e.g., in the case of the `terminalaltitude` field above) returned the expected fields (mean, variance, etc.) without any problem.
Hello,
I tested `read.avro` using a moderately complicated schema. Some fields contain sub-records; other fields contain arrays of records. One of the sub-records (named `moments` and containing mean, variance, skewness, and kurtosis fields) is defined the first time and referenced as a type subsequently. This does not cause Avro any problems, but `read.avro` throws the following error:
The schema being read here reads (in part):
Note that `moments` is defined as a type (as part of a union) for the first time in the `initialalttude` field, which is a field of the `routemetrics` record nested inside of the top-level `route` field. After that, `moments` is referenced by name in the subsequent `terminalaltitude` field.
Are there any plans to deal well with schemas like the one above?
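To make the request concrete, here is a rough sketch of the name resolution a reader would need: register each named type the first time it is defined, then substitute that definition wherever the name later appears as a type. This is plain Python over the schema JSON, not how `read.avro` or the Avro Java library is implemented. The schema below is a hypothetical reconstruction of the shape described above: the top-level record name `flight`, the `double` field types, and the union around `terminalaltitude` are assumptions, and `resolve` and `field_names` are illustrative helper names.

```python
import json

# Hypothetical schema: `moments` is defined once (inside a union) and then
# referenced by name in a later field.
SCHEMA_JSON = """
{
  "type": "record", "name": "flight",
  "fields": [
    {"name": "route", "type": {
      "type": "record", "name": "routemetrics",
      "fields": [
        {"name": "initialalttude", "type": ["null", {
          "type": "record", "name": "moments",
          "fields": [
            {"name": "mean", "type": "double"},
            {"name": "variance", "type": "double"},
            {"name": "skewness", "type": "double"},
            {"name": "kurtosis", "type": "double"}
          ]}]},
        {"name": "terminalaltitude", "type": ["null", "moments"]}
      ]}}
  ]
}
"""


def resolve(schema, named=None):
    """Replace by-name references with the definitions seen earlier."""
    if named is None:
        named = {}
    if isinstance(schema, str):
        # Either a primitive ("double") or a reference to a named type.
        return named.get(schema, schema)
    if isinstance(schema, list):
        # Union: resolve each branch in order.
        return [resolve(branch, named) for branch in schema]
    kind = schema["type"]
    if isinstance(kind, (dict, list)):
        return dict(schema, type=resolve(kind, named))
    if kind == "record":
        out = dict(schema, fields=[])
        # Register before visiting fields so later references can find it.
        named[schema["name"]] = out
        for field in schema["fields"]:
            out["fields"].append(dict(field, type=resolve(field["type"], named)))
        return out
    if kind in ("enum", "fixed"):
        named[schema["name"]] = schema
        return schema
    if kind == "array":
        return dict(schema, items=resolve(schema["items"], named))
    if kind == "map":
        return dict(schema, values=resolve(schema["values"], named))
    return schema


def field_names(record):
    """List the field names of a resolved record schema."""
    return [f["name"] for f in record["fields"]]


resolved = resolve(json.loads(SCHEMA_JSON))
routemetrics = resolved["fields"][0]["type"]
terminal = routemetrics["fields"][1]["type"]  # ["null", <moments record>]
print(field_names(terminal[1]))
```

After resolution, the non-null branch of `terminalaltitude` carries the same four fields as the original `moments` definition, which matches what `fld.schema().getFields()` was reported to return in the Java API.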