dapr / java-sdk

Dapr SDK for Java
Apache License 2.0
260 stars 206 forks source link

[RECALLED] Wrong value in state store when use custom json serializer #503

Open abogdanov37 opened 3 years ago

abogdanov37 commented 3 years ago

I use dapr v1.0.0. I wrote the actor use Java SDK. To use the right OffsetDateTime class serialization I wrote a custom Json serializer class ISO8601DaprObjectSerializer extends ObjectSerializer implements DaprObjectSerializer.

Expected Behavior

When I use this serializer I want to see JSON in state store

Actual Behavior

Now in state store I see base64 encoded string. In this case, I should use a special deserialized object constructor with string parameter.

Steps to Reproduce the Problem

When register actor add custom serializer

import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.*;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import io.dapr.client.ObjectSerializer;
import io.dapr.client.domain.CloudEvent;
import io.dapr.serializer.DaprObjectSerializer;
import io.dapr.utils.TypeRef;

import java.io.IOException;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.regex.Pattern;

public class ISO8601DaprObjectSerializer extends ObjectSerializer implements DaprObjectSerializer {

    private static final Pattern PATTERN_ISO8601 = Pattern.compile("^(-?(?:[1-9][0-9]*)?[0-9]{4})-(1[0-2]|0[1-9])-(3[01]|0[1-9]|[12][0-9])T(2[0-3]|[01][0-9]):([0-5][0-9]):([0-5][0-9])(\\.[0-9]+)?(Z|[+-](?:2[0-3]|[01][0-9]):[0-5][0-9])?$");
    private static final DateTimeFormatter ISO8601 = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSX").withZone(ZoneId.of("UTC"));

    private static final JavaTimeModule TIME_MODULE = new JavaTimeModule();
    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper()
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
            .configure(MapperFeature.ACCEPT_CASE_INSENSITIVE_PROPERTIES, true)
            .setSerializationInclusion(JsonInclude.Include.NON_NULL)
            .disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
            .registerModule(TIME_MODULE);

    static  {
        TIME_MODULE.addSerializer(OffsetDateTime.class, new JsonSerializer<>() {
            @Override
            public void serialize(OffsetDateTime offsetDateTime, JsonGenerator jsonGenerator, SerializerProvider serializerProvider) throws IOException {
                jsonGenerator.writeString(offsetDateTime.format(ISO8601));
            }

        });
    }

    @Override
    public byte[] serialize(Object state) throws IOException {
        if (state == null) {
            return null;
        }

        if (state.getClass().equals(Void.class)) {
            return null;
        }

        // Have this check here to be consistent with deserialization (see deserialize() method below).
        if (state instanceof byte[]) {
            return (byte[]) state;
        }

        // Not string, not primitive, so it is a complex type: we use JSON for that.
        return OBJECT_MAPPER.writeValueAsBytes(state);
    }

    @Override
    public <T> T deserialize(byte[] content, TypeRef<T> type) throws IOException {
        var javaType = OBJECT_MAPPER.constructType(type.getType());

        if ((javaType == null) || javaType.isTypeOrSubTypeOf(Void.class)) {
            return null;
        }

        if (javaType.isPrimitive()) {
            return deserializePrimitives(content, javaType);
        }

        if (content == null) {
            return (T) null;
        }

        // Deserialization of GRPC response fails without this check since it does not come as base64 encoded byte[].
        if (javaType.hasRawClass(byte[].class)) {
            return (T) content;
        }

        if (content.length == 0) {
            return (T) null;
        }

        if (javaType.hasRawClass(CloudEvent.class)) {
            return (T) CloudEvent.deserialize(content);
        }

        return OBJECT_MAPPER.readValue(content, javaType);
    }

    @Override
    public String getContentType() {
        return "application/json";
    }

    /**
     * Parses a given String to the corresponding object defined by class.
     *
     * @param content  Value to be parsed.
     * @param javaType Type of the expected result type.
     * @param <T>      Result type.
     * @return Result as corresponding type.
     * @throws IOException if cannot deserialize primitive time.
     */
    private static <T> T deserializePrimitives(byte[] content, JavaType javaType) throws IOException {
        if ((content == null) || (content.length == 0)) {
            if (javaType.hasRawClass(boolean.class)) {
                return (T) Boolean.FALSE;
            }

            if (javaType.hasRawClass(byte.class)) {
                return (T) Byte.valueOf((byte) 0);
            }

            if (javaType.hasRawClass(short.class)) {
                return (T) Short.valueOf((short) 0);
            }

            if (javaType.hasRawClass(int.class)) {
                return (T) Integer.valueOf(0);
            }

            if (javaType.hasRawClass(long.class)) {
                return (T) Long.valueOf(0L);
            }

            if (javaType.hasRawClass(float.class)) {
                return (T) Float.valueOf(0);
            }

            if (javaType.hasRawClass(double.class)) {
                return (T) Double.valueOf(0);
            }

            if (javaType.hasRawClass(char.class)) {
                return (T) Character.valueOf(Character.MIN_VALUE);
            }

            return null;
        }

        return OBJECT_MAPPER.readValue(content, javaType);
    }

    public static ObjectMapper getMapper() {
        return OBJECT_MAPPER;
    }
}

Release Note

RELEASE NOTE: FIX Serialization bug in SDK where custom JSON serializer is handled as byte[]

artursouza commented 3 years ago

Please, do not extend ObjectSerializer, it is used for internal objects only. Simply, implement DaprObjectSerializer interface. The PR should focus on handling the content-type value correctly and not extending DefaultObjectSerializer.

artursouza commented 3 years ago

@abogdanov37 This fix must be undone because it is a breaking change. Data stored with the old version cannot be read with this patch. We are reverting this and undoing that fix in the patch release too.

abogdanov37 commented 3 years ago

Should I add fallback to support old way?

artursouza commented 3 years ago

@abogdanov37 For this feature to be added, the old way should be the default and the new way must be opted in. The design would need to be a little bit creative here. Take your time to think about this.

I am sorry for this but we are in a different mode now in post v1.0 where breaking changes are a big deal, specially when dealing with persisted data.

Anyhow, I really appreciate your open mind here and willingness to contribute.

abogdanov37 commented 3 years ago

I understand the problem with breaking changes. I take some time to think about the way and add changes.

abogdanov37 commented 3 years ago

@abogdanov37 This fix must be undone because it is a breaking change. Data stored with the old version cannot be read with this patch. We are reverting this and undoing that fix in the patch release too.

@artursouza I add new PR with some new tests. Please describe with more detail which state can't be deserialized. I spend many time but can't reproduce that behavior.

artursouza commented 3 years ago

@abogdanov37 Like we discussed in the issue, I will look into that before merging the PR. Ideally, I can come up with a new unit tests that can gate this regression. If my reading of the code is wrong, we can merge the PR.