FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
314 stars 136 forks source link

Missing "null" default values when generating schema #207

Open idkw opened 4 years ago

idkw commented 4 years ago

After upgrading from jackson 2.10.3 to 2.11.0 and using avro 1.9.2, I noticed that the schema generation is broken regarding default "null" values for union types.

Exemple POJO :

public class Book {

    @JsonProperty(defaultValue = "null")
    private String title;

    public Book() {
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

}

Code to generate the JSON schema :

AvroMapper mapper = new AvroMapper(new AvroFactory());
com.fasterxml.jackson.dataformat.avro.schema.AvroSchemaGenerator gen = new com.fasterxml.jackson.dataformat.avro.schema.AvroSchemaGenerator();
mapper.acceptJsonFormatVisitor(Book.class, gen);
AvroSchema schemaWrapper = gen.getGeneratedSchema();
org.apache.avro.Schema schema = schemaWrapper.getAvroSchema();
System.out.println(schema.toString(true));

With version 2.10.3 we had :

{
    "type": "record",
    "name": "Book",
    "namespace": "com.example",
    "fields": [
        {
            "name": "title",
            "type": [
                "null",
                "string"
            ],
            "default": null
        }
    ]
}

Now with version 2.11.0 we have :

{
    "type": "record",
    "name": "Book",
    "namespace": "com.example",
    "fields": [
        {
            "name": "title",
            "type": [
                "null",
                "string"
            ]
        }
    ]
}

The default field in the schema is missing when generating the schema with version 2.11.0

After short analysis, I have noticed than in the org.apache.avro class, the method hasDefaultValue always returns false, so when generating the json schema output in the fieldsToJson method, this code never write the default field :

        if (f.hasDefaultValue()) {
          gen.writeFieldName("default");
          gen.writeTree(f.defaultValue());
        }

The Field instance is instanciated by com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor and it seems to mess up with the default value :


        JsonNode defaultValue = AvroSchemaHelper.parseDefaultValue(prop.getMetadata().getDefaultValue());
        writerSchema = this.reorderUnionToMatchDefaultType(writerSchema, defaultValue);
        Field field = new Field(prop.getName(), writerSchema, prop.getMetadata().getDescription(), AvroSchemaHelper.jsonNodeToObject(defaultValue));
        AvroMeta meta = (AvroMeta)prop.getAnnotation(AvroMeta.class);
        if (meta != null) {
            field.addProp(meta.key(), meta.value());
        }

When debugging, with this Book class :

Did something change to how I should specify a default value on a POJO field ? I tried with @JsonProperty and @AvroDefault but none work.

idkw commented 4 years ago

The issue seems to arise when using the avro dependency >= 1.9.x due to this change : https://issues.apache.org/jira/browse/AVRO-2369

The 1.9.x avro dependency in its Field constructor does this weird check instead of checking for NullNode directly:

    public Field(String name, Schema schema, String doc, Object defaultValue) {
      this(name, schema, doc,
          defaultValue == NULL_DEFAULT_VALUE ? NullNode.getInstance() : JacksonUtils.toJsonNode(defaultValue), true,
          Order.ASCENDING);
    }

However the NULL_DEFAULT_VALUE field is just defined as Object NULL_DEFAULT_VALUE = new Object(); and didn't exist in the 1.8.x releases

To fix it I need to either use reflection or upgrade the avro dependency to 1.9.2 and fix the jsonNodeToObject method.

    public static Object jsonNodeToObject(JsonNode defaultJsonValue) {
        if(defaultJsonValue != null && defaultJsonValue.isNull()) {
            return Schema.Field.NULL_DEFAULT_VALUE;
        }
        return DEFAULT_VALUE_MAPPER.convertValue(defaultJsonValue, Object.class);
    }

After reading some of the currently opened issues it appears there is a lot of friction and incompataibilities that arises from the fact of upgrading avro to 1.9.2 As I'm starting a new project, I don't have such friction, so I'll fork the 2.11 branch and will wait for a proper fix in the 3.x release

By the way, when is the 3.x release due to be released approximately ?

cowtowncoder commented 4 years ago

No good idea on 3.x, but I would not necessary wait for that. 2.12 should be finalized within 1-2 months, released by end of September, so some fixes (progress towards better avro-lib 1.9 or beyond compatibility) could go there.

Chuckame commented 11 months ago

Found a solution (not a perfect one, but doing the work):

It also handles other @NotNull annotations.

It will put null for all non required/@NotNull fields.

It will put [] for all required/@NotNull lists.


        AvroMapper avroMapper = AvroMapper.builder()
                .addModule(new SimpleModule(){
                    @Override
                    public void setupModule(SetupContext context) {
                        context.insertAnnotationIntrospector(new AnnotationIntrospector() {
                            @Override
                            public Boolean hasRequiredMarker(AnnotatedMember m) {
                                if (isRequired(m))
                                    return true;
                                return null;
                            }

                            private boolean isRequired(Annotated ann) {
                                return ann.hasOneOf(new Class[]{NotNull.class, org.jetbrains.annotations.NotNull.class});
                            }

                            @Override
                            public String findPropertyDefaultValue(Annotated ann) {
                                if (!isRequired(ann)) {
                                    return "null";
                                }
                                if (ann.getType().isCollectionLikeType()) {
                                    return "[]";
                                }
                                return null;
                            }

                            @Override
                            public Version version() {
                                return Version.unknownVersion();
                            }
                        });
                    }
                })
cowtowncoder commented 11 months ago

Thank you for sharing @Chuckame

Skjolberg commented 6 months ago

@cowtowncoder try using @AvroDefault

cowtowncoder commented 6 months ago

@Skjolberg could you elaborate?

Skjolberg commented 6 months ago

@cowtowncoder you need to set a default value in some fields, right? Well with the @AvroDefault annotation in Java you can do this when generating the schema.

cowtowncoder commented 6 months ago

@Skjolberg This is about generating schema via Avro module, from regular POJO that does not use Avro annotations. It looks like @AvroDefault was mentioned in an earlier comment as also not working (not being recognized or applied).

So I was thinking you might have an idea of how processing needs to be changed or so, based on earlier discussions.

Skjolberg commented 6 months ago

@cowtowncoder I'm using 2.15.3 on Jackson and 1.11.0 on avro, what I said above is working in my case.

cowtowncoder commented 6 months ago

@Skjolberg Interesting. Previous comments suggested it didn't, but perhaps they tried using something pre apache avro 1.11.0? We do have some issues wrt jackson-dataformat-avro and post-1.8.2 avro-lib, regarding tests but I guess runtime one can use overrides.