apache / fury

A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
https://fury.apache.org/
Apache License 2.0

[Question] Cannot add a field in `CompatibleMode.SCHEMA_CONSISTENT` mode when deserializing? Or is it a bug? #1880

Closed liqipeng closed 1 month ago

liqipeng commented 1 month ago

Question

Language: Java

Version: 0.8.0

Step 1: Serialize the object without the `private Integer age;` field

public class Demo1 {
    @Data
    public static class Class1 {
        private String name;
        // private Integer age;
    }
    public static void main(String[] args) throws IOException {
        Fury fury = Fury.builder()
                .withLanguage(Language.JAVA)
                .withRefTracking(true)
                .requireClassRegistration(false)
                .withCompatibleMode(CompatibleMode.SCHEMA_CONSISTENT)
                .build();
        File file = new File("test.dat");

        Class1 obj1 = new Class1();
        obj1.setName("Tom");
        byte[] data1 = fury.serialize(obj1);
        FileUtils.writeByteArrayToFile(file, data1);
    }
}

Step 2: Deserialize the object with the `private Integer age;` field added

public class Demo1 {
    @Data
    public static class Class1 {
        private String name;
        private Integer age;
    }
    public static void main(String[] args) throws IOException {
        Fury fury = Fury.builder()
                .withLanguage(Language.JAVA)
                .withRefTracking(true)
                .requireClassRegistration(false)
                .withCompatibleMode(CompatibleMode.SCHEMA_CONSISTENT)
                .build();
        File file = new File("test.dat");

        byte[] data2 = FileUtils.readFileToByteArray(file);
        Class1 obj2 = (Class1) fury.deserialize(data2);
        // Print the comparison result instead of discarding it
        System.out.println(Objects.equals("Tom", obj2.name));
    }
}

This throws an exception:

Exception in thread "main" org.apache.fury.exception.DeserializationException: Deserialize failed, read objects are: [Demo1.Class1(name=null, age=6)]
    at org.apache.fury.util.ExceptionUtils.handleReadFailed(ExceptionUtils.java:63)
    at org.apache.fury.Fury.deserialize(Fury.java:796)
    at org.apache.fury.Fury.deserialize(Fury.java:714)
    at com.example.Demo1.main(Demo1.java:33)
Caused by: java.lang.IllegalArgumentException: 2
    at org.apache.fury.util.Preconditions.checkArgument(Preconditions.java:52)
    at org.apache.fury.serializer.StringSerializer.readUtf8(StringSerializer.java:252)
    at org.apache.fury.serializer.StringSerializer.readCompressedCharsString(StringSerializer.java:247)
    at com.example.Demo1_Class1FuryRefCodec_0.read(Demo1_Class1FuryRefCodec_0.java:69)
    at org.apache.fury.Fury.readDataInternal(Fury.java:958)
    at org.apache.fury.Fury.readRef(Fury.java:860)
    at org.apache.fury.Fury.deserialize(Fury.java:792)

liqipeng commented 1 month ago

Currently, SCHEMA_CONSISTENT and COMPATIBLE modes are completely incompatible. I think it may be necessary for COMPATIBLE mode to stay compatible with SCHEMA_CONSISTENT mode, so that a user can easily switch from SCHEMA_CONSISTENT mode to COMPATIBLE mode when the schema needs to change.

chaokunyang commented 1 month ago

We don't have plans to make SCHEMA_CONSISTENT and COMPATIBLE modes compatible with each other. If your schema may change, you need to use COMPATIBLE from the start.

liqipeng commented 1 month ago

Currently, SCHEMA_CONSISTENT and COMPATIBLE modes are completely incompatible. I think it may be necessary for COMPATIBLE mode to stay compatible with SCHEMA_CONSISTENT mode, so that a user can easily switch from SCHEMA_CONSISTENT mode to COMPATIBLE mode when the schema needs to change.

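@chaokunyang Sorry, my clumsy English may not have made this clear. What I meant is that a one-way compatibility could be considered: supporting migration from SCHEMA_CONSISTENT to COMPATIBLE only. A schema designer may not think things through at the start and choose SCHEMA_CONSISTENT; later, when a field needs to be added, they could switch to COMPATIBLE mode to meet the requirement.

I don't yet understand Fury's internals, so this is a guess. If this one-way compatibility were implemented, it would not require changes to Fury's core and would be more of an API-level compatibility. When deserializing in COMPATIBLE mode, Fury would first probe which mode the binary data was written in (assuming there is a corresponding flag field); if the data actually came from SCHEMA_CONSISTENT, it would simply be deserialized as SCHEMA_CONSISTENT.

This kind of compatibility would improve Fury's usability. The default is currently SCHEMA_CONSISTENT, and a beginner who has not understood the difference between the two modes in depth can easily make the wrong decision. Because there is currently no migration path at all between the two modes, switching modes is very costly.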

chaokunyang commented 1 month ago

We can't do that. Schema compatible mode needs to write class meta for deserialization; schema consistent mode doesn't write that meta.

liqipeng commented 1 month ago

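@chaokunyang Is there currently a utility API that can detect which mode serialized data was written in (SCHEMA_CONSISTENT or COMPATIBLE)? (Similar to the tool for detecting JDK serialization, JavaSerializer.serializedByJDK(byte[] data))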

chaokunyang commented 1 month ago

We don't provide such tools. You can write such a header yourself before writing the Fury serialized data.
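
A minimal sketch of what such a header could look like, assuming a single tag byte placed in front of the serialized bytes. The class `ModeHeader`, its method names, and the tag values are all hypothetical and not part of Fury; the payload is treated as an opaque byte array, so nothing here relies on Fury's wire format.

```java
import java.util.Arrays;

// Hypothetical helper, NOT part of Fury: frames an opaque payload with one
// leading tag byte that records which compatibility mode produced it.
final class ModeHeader {
    // Assumed tag values; any two distinct bytes would do.
    static final byte SCHEMA_CONSISTENT = 0;
    static final byte COMPATIBLE = 1;

    private ModeHeader() {}

    // Returns a new array: [mode tag][payload bytes...].
    static byte[] wrap(byte mode, byte[] payload) {
        checkMode(mode);
        byte[] framed = new byte[payload.length + 1];
        framed[0] = mode;
        System.arraycopy(payload, 0, framed, 1, payload.length);
        return framed;
    }

    // Reads the tag byte and rejects data that has no valid header.
    static byte modeOf(byte[] framed) {
        if (framed.length == 0) {
            throw new IllegalArgumentException("missing mode header");
        }
        checkMode(framed[0]);
        return framed[0];
    }

    // Strips the header and returns only the original payload bytes.
    static byte[] payloadOf(byte[] framed) {
        modeOf(framed); // validate the header before slicing
        return Arrays.copyOfRange(framed, 1, framed.length);
    }

    private static void checkMode(byte mode) {
        if (mode != SCHEMA_CONSISTENT && mode != COMPATIBLE) {
            throw new IllegalArgumentException("unknown mode tag: " + mode);
        }
    }
}
```

On the write side, the output of `fury.serialize(obj)` would be passed to `wrap` with the tag matching the mode of that Fury instance. On the read side, `modeOf` would select a Fury instance built with the same mode, and `payloadOf` would supply the bytes for `deserialize`. Data already written without a header cannot be identified this way, so the header has to be in place before the mode is switched.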

liqipeng commented 1 month ago

We don't provide such tools. You can write such a header yourself before writing the Fury serialized data.

Got it. Thank you for your reply.