apache / fury

A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
https://fury.apache.org/
Apache License 2.0

[Question] For the Java PB object generated by Protobuf, the serialization and deserialization using Fury are slower than using Protobuf. #1945

Closed a1342772 closed 2 days ago

a1342772 commented 3 days ago
GrpcService.ModelInferRequest resp = GrpcService.ModelInferRequest
        .newBuilder()
        .addAllInputs(inputs)
        .build();

// Fury round trip
byte[] furyData = fury.serializeJavaObject(resp);
fury.deserializeJavaObject(furyData, GrpcService.ModelInferRequest.class);

// Protobuf round trip
byte[] pbData = resp.toByteArray();
GrpcService.ModelInferRequest.parseFrom(pbData);
a1342772 commented 3 days ago

@chaokunyang

chaokunyang commented 2 days ago

@a1342772 A Java PB object generated by Protobuf is an internal object representation in Protobuf; it holds a lot of internal state that only Protobuf uses. You should not serialize such objects with another serialization framework. You could define your data as POJOs and pass those objects to Fury instead.

chaokunyang commented 2 days ago

@a1342772 If you do want to serialize such Protobuf-generated objects using Fury, you could apply the following optimization:

Finally, Protobuf-generated objects cache a lot of field data, such as booleanArrayMemoizedSerializedSize, longArrayMemoizedSerializedSize, memoizedHashCode, memoizedIsInitialized, memoizedSize, and bitField0_. These are all redundant and only meaningful to Protobuf. You should find a way to skip serialization of those fields.
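
For a class you control (for example a POJO that mirrors the message), one common way to keep memoized state out of a payload is the `transient` modifier. The sketch below uses only the JDK's built-in serialization to show the effect. `TensorData` and its fields are hypothetical names, and whether a particular Fury version treats `transient` fields the same way is an assumption to check against its documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical POJO: `values` is real data, `memoizedSum` is a derived cache.
final class TensorData implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int[] values;
    // transient: skipped by JDK serialization and recomputed lazily after reading.
    private transient long memoizedSum;
    private transient boolean sumComputed;

    TensorData(int[] values) { this.values = values.clone(); }

    long sum() {
        if (!sumComputed) {
            long total = 0;
            for (int v : values) total += v;
            memoizedSum = total;
            sumComputed = true;
        }
        return memoizedSum;
    }

    boolean isSumCached() { return sumComputed; }

    // Round-trips the object through JDK serialization.
    static TensorData roundTrip(TensorData in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream src =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TensorData) src.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After `roundTrip`, the copy carries the `values` array but not the cached sum, which is rebuilt on the first call to `sum()`. Protobuf-generated classes cannot be edited this way, so this only applies to hand-written types.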

a1342772 commented 2 days ago

1. The Java POJO is too large, and serializing it directly with Fury takes a long time, so it has to be converted into a Protobuf (pb) object before serialization. Do you have any solutions to this problem?
2. Do you have examples of serializing Protobuf objects in Java? @chaokunyang

chaokunyang commented 2 days ago

@a1342772 Could you share the full proto definition here for the data you want to serialize? We can't do anything with the code you posted in this issue.

chaokunyang commented 2 days ago

> 1. The Java POJO is too large, and serializing it directly with Fury takes a long time, so it has to be converted into a Protobuf (pb) object before serialization. Do you have any solutions to this problem?
> 2. Do you have examples of serializing Protobuf objects in Java? @chaokunyang

I don't think serializing the POJO with Fury directly will take longer. Many companies have already switched from pb to Fury, and they all got a speedup of several times. If you could provide fully reproducible code and share JMH benchmark results, we could take a look. Otherwise, I don't think we can do anything.

a1342772 commented 2 days ago

The application scenario is search and recommendation, where proto objects need to be constructed for model inference. The proto file is as follows: grpc_service.proto.txt

The test code is as follows:

static Fury furyTest;

static {
    furyTest = Fury.builder()
            .withLanguage(Language.JAVA)
            .requireClassRegistration(false)
            .withCodegen(false)
            .withRefTracking(false)
            .withAsyncCompilation(true)
            .build();
}

public static void main(String[] args) {
    Map<String, ArrayList<FeaOutputV3>> itemsFeatDataV3 = new HashMap<>();
    int numFeatures = 320;
    int numColumns = 20;

    // Simulate feature data
    for (int i = 0; i < numFeatures; i++) {
        String featureName = "feature_" + i;
        ArrayList<FeaOutputV3> featureDataList = new ArrayList<>();

        for (int j = 0; j < numColumns; j++) {
            IntList featureValues = new IntArrayList();
            for (int k = 0; k < numColumns; k++) {
                featureValues.add(i * numColumns + k);
            }
            FeaOutputV3 feaOutputV3 = new FeaOutputV3();
            feaOutputV3.setFixSizeList(featureValues);
            featureDataList.add(feaOutputV3);
        }

        itemsFeatDataV3.put(featureName, featureDataList);
    }

    // Build the input tensors
    GrpcService.ModelInferRequest.InferInputTensor.Builder tensorBuilder =
            GrpcService.ModelInferRequest.InferInputTensor.newBuilder();
    GrpcService.InferTensorContents.Builder tensorContentBuilder =
            GrpcService.InferTensorContents.newBuilder();

    List<GrpcService.ModelInferRequest.InferInputTensor> inputs = new ArrayList<>();
    itemsFeatDataV3.forEach((name, featList) -> {
        int l1 = featList.size();
        if (CollectionUtils.isNotEmpty(featList.get(0).getFixSizeList())) {
            int l2 = featList.get(0).getFixSizeList().size();
            IntList features = new IntArrayList(l1 * l2);

            featList.forEach(feaOutput -> features.addAll(feaOutput.getFixSizeList()));
            inputs.add(featureBuildWithFast(tensorBuilder, tensorContentBuilder, features, name, l1, l2));
        }
    });

    // Measure serialization performance
    for (int count = 0; count < 50; count++) {
        long startSerialization = System.nanoTime();
        GrpcService.ModelInferRequest resp = GrpcService.ModelInferRequest
                .newBuilder()
                .addAllInputs(inputs)
                .build();

        byte[] data = furyTest.serializeJavaObject(resp);
        furyTest.deserializeJavaObject(data, GrpcService.ModelInferRequest.class);

        //protobuf
        //byte[] data = resp.toByteArray();
        //GrpcService.ModelInferRequest.parseFrom(data);
        long endSerialization = System.nanoTime();
        long serializationTimeMs = (endSerialization - startSerialization) / 1_000_000;

        System.out.println("Serialization time (milliseconds): " + serializationTimeMs + " size: " + data.length);
    }
}

public static GrpcService.ModelInferRequest.InferInputTensor featureBuildWithFast(
        GrpcService.ModelInferRequest.InferInputTensor.Builder tensorBuilder1,
        GrpcService.InferTensorContents.Builder tensorContentBuilder2,
        IntList features, String featureName, int l1, int l2) {

    GrpcService.ModelInferRequest.InferInputTensor.Builder tensorBuilder =
            GrpcService.ModelInferRequest.InferInputTensor.newBuilder();
    GrpcService.InferTensorContents.Builder tensorContentBuilder =
            GrpcService.InferTensorContents.newBuilder();
    if (features == null) {
        features = new IntArrayList();
    }
    tensorContentBuilder.addAllIntContents(features);
    GrpcService.ModelInferRequest.InferInputTensor input = tensorBuilder.setName(featureName)
            .setDatatype("INT32")
            .addShape(l1)
            .addShape(l2)
            .setContents(tensorContentBuilder)
            .buildPartial();
    return input;
}
a1342772 commented 2 days ago

@chaokunyang The bytes obtained by directly serializing a Java object "itemsFeatDataV3" with Fury are several times larger than those obtained by serializing the object after constructing the Protobuf object.
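
A reference point can help when comparing sizes like this. The sketch below estimates only the integer payload that Protobuf's varint encoding would need for the values generated by the test code above (`i * numColumns + k`), and compares it with a fixed 4 bytes per int. It ignores field tags, length prefixes, and feature names, and it says nothing about how Fury encodes the same data; the class and method names are hypothetical.

```java
// Back-of-the-envelope size estimate for the int data built in the test above.
final class VarintSizeEstimate {
    // Bytes used by a Protobuf int32 written as a varint (7 payload bits per byte).
    static int varintSize(int value) {
        if (value < 0) {
            return 10; // negative int32 values are sign-extended to 64 bits
        }
        int size = 1;
        while ((value >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    // Sums the varint sizes of i * numColumns + k, repeated numColumns times per feature.
    static long estimatePayloadBytes(int numFeatures, int numColumns) {
        long total = 0;
        for (int i = 0; i < numFeatures; i++) {
            for (int j = 0; j < numColumns; j++) {
                for (int k = 0; k < numColumns; k++) {
                    total += varintSize(i * numColumns + k);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long varintBytes = estimatePayloadBytes(320, 20);
        long fixedBytes = 320L * 20 * 20 * Integer.BYTES;
        System.out.println("varint payload bytes: " + varintBytes);
        System.out.println("fixed 4-byte payload bytes: " + fixedBytes);
    }
}
```

Comparing the measured `data.length` of each approach against such an estimate shows how much of the output is raw values and how much is structural overhead.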

chaokunyang commented 2 days ago

You disabled Fury JIT with withCodegen(false), so serialization will be slow:

furyTest = Fury.builder()
            .withLanguage(Language.JAVA)
            .requireClassRegistration(false)
            .withCodegen(false)
            .withRefTracking(false)
            .withAsyncCompilation(true)
            .build();

Please create Fury like this:

fury = Fury.builder()
            .withLanguage(Language.JAVA)
            .withRefTracking(false)
            .build();

Also register all the types, and please don't disable class registration:

fury.register(GrpcService.ModelInferRequest.InferInputTensor.class);
fury.register(GrpcService.ModelInferRequest.class);
fury.register(GrpcService.InferTensorContents.class);

Also, please warm up serialization several times before benchmarking.
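
A minimal sketch of what such a warm-up could look like, using only the JDK: run the workload untimed first, then time a batch and report the mean in nanoseconds rather than truncated milliseconds. `WarmupBench` and `meanNanos` are hypothetical names, and the array copy in `main` is only a placeholder for the serialize/deserialize call. A JMH benchmark remains the more reliable option.

```java
import java.util.function.Supplier;

final class WarmupBench {
    // Results are written here so the JIT cannot discard the work as unused.
    static volatile Object sink;

    // Runs `task` untimed `warmupRuns` times, then returns the mean
    // nanoseconds per call over `timedRuns` timed calls.
    static double meanNanos(Supplier<?> task, int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            sink = task.get();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / timedRuns;
    }

    public static void main(String[] args) {
        int[] payload = new int[128_000];
        // Placeholder workload; in the thread's case this lambda would wrap
        // the serialize + deserialize round trip being measured.
        double nanos = meanNanos(payload::clone, 200, 50);
        System.out.printf("mean time per call: %.1f ns%n", nanos);
    }
}
```

Building the object once outside the timed loop also keeps construction cost out of the measurement, which the original test loop includes.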

a1342772 commented 2 days ago

@chaokunyang [screenshot] The protected object cannot be registered.

chaokunyang commented 2 days ago

@a1342772 You could pass a qualified class name string instead, like fury.register("com.google.protobuf.GeneratedMessageLite.SerializedForm"). Please take a look at the Java API in the Fury class.

a1342772 commented 2 days ago

[screenshot] Version 0.9.0 does not provide the corresponding API.

chaokunyang commented 2 days ago

Use fury.getClassResolver().register() instead. I forgot to add it to the BaseFury interface.

chaokunyang commented 2 days ago

You can also invoke Class.forName() instead.
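
One detail matters with this approach: `Class.forName` expects the binary name, in which a nested class is separated from its outer class by `$`, not `.`. The sketch below shows this with a non-public nested JDK class, so it runs without Protobuf on the classpath. `NestedClassLookup` is a hypothetical helper, and by the same rule the nested Protobuf class mentioned earlier would presumably be written with `$` as well.

```java
import java.util.Optional;

final class NestedClassLookup {
    // Loads a class by binary name, or returns empty if no such class exists.
    static Optional<Class<?>> load(String binaryName) {
        try {
            return Optional.of(Class.forName(binaryName));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Non-public nested class: cannot be named in source, but can be loaded.
        Optional<Class<?>> nested = load("java.util.Collections$UnmodifiableList");
        System.out.println("binary name found: " + nested.isPresent());

        // The dotted source-style name is not a binary name and does not resolve.
        Optional<Class<?>> dotted = load("java.util.Collections.UnmodifiableList");
        System.out.println("dotted name found: " + dotted.isPresent());

        // The resulting Class object could then be passed to a registration
        // call such as fury.register(cls), as used earlier in this thread.
    }
}
```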

a1342772 commented 2 days ago

After a 200-iteration warm-up with the following configuration: [screenshot]

Direct serialization and deserialization of proto objects take 20ms using Protobuf, while Fury takes 23ms. Fury cannot be applied directly to proto objects. Is there a way to bridge this gap?
[screenshot]

a1342772 commented 2 days ago

@chaokunyang Are there any solutions for serializing and deserializing proto objects using Fury?

chaokunyang commented 2 days ago

@a1342772 Could you give me a unit test which I can run directly locally?

a1342772 commented 2 days ago

@chaokunyang Of course, I can. I have built a very simple Java project. Directly run the main method in the MAIN class, which provides two serialization methods: Fury and Protobuf. fury.zip

chaokunyang commented 2 days ago

image

Protobuf-generated classes implement the JDK writeReplace method, so Fury invokes that method for compatibility.
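
For readers unfamiliar with this hook, here is a minimal JDK-only sketch of how `writeReplace` and `readResolve` work. `Celsius` and its nested `Form` class are hypothetical and unrelated to Protobuf's actual implementation. The point is only that a serializer honoring the hook ends up writing the substitute object rather than the original's fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value class, not part of Protobuf or Fury.
final class Celsius implements Serializable {
    private static final long serialVersionUID = 1L;
    static int writeReplaceCalls = 0; // counts hook invocations for the demo

    private final double degrees;

    Celsius(double degrees) { this.degrees = degrees; }

    double degrees() { return degrees; }

    // JDK hook: the stream stores the returned object instead of this one.
    private Object writeReplace() {
        writeReplaceCalls++;
        return new Form(degrees);
    }

    // Stand-in that is written to the stream in place of Celsius.
    private static final class Form implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double degrees;

        Form(double degrees) { this.degrees = degrees; }

        // JDK hook: turns the stand-in back into the real type on read.
        private Object readResolve() { return new Celsius(degrees); }
    }

    // Round-trips a value through JDK serialization.
    static Celsius roundTrip(Celsius in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream src =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Celsius) src.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If the substitute object is itself produced by another encoder, the outer serializer pays for that encoding plus its own wrapping, which would be consistent with the small slowdown reported above.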

chaokunyang commented 2 days ago

Protobuf-generated classes can only be serialized by Protobuf, so you need to use PB for serialization. Otherwise, you should pass your POJO objects.

chaokunyang commented 2 days ago

The PB-generated objects even have a weak map and circular references: [image]

Such fields are used by Protobuf only; it's not possible to serialize them with other frameworks.

chaokunyang commented 2 days ago

Here is the code to work around ReplaceResolve:

        furyTest = Fury.builder()
                .withLanguage(Language.JAVA).requireClassRegistration(false)
                .withRefTracking(true)
                .build();
        furyTest.getClassResolver().setSerializerFactory((f, c) -> {
            if (Message.class.isAssignableFrom(c)) {
                return Serializers.newSerializer(f, c, f.getClassResolver().getObjectSerializerClass(c, x -> {}));
            }
            return null;
        });

With this code, you will get the error shown above.

chaokunyang commented 2 days ago

It's possible to extend Fury to serialize PB-generated objects in a special way to get the speedup you want, but that will take time and a lot of effort.

a1342772 commented 1 day ago

@chaokunyang The pojo object is much larger than the pb object. If the pojo is not converted to the pb object and is serialized directly, the serialized size will be very large.

a1342772 commented 1 day ago

Do you have any suggestions for this issue?

chaokunyang commented 4 hours ago

> @chaokunyang The POJO object is much larger than the pb object. If the POJO is not converted to the pb object and is serialized directly, the serialized size will be very large.

Maybe we could look into why the serialized size of POJO is larger. Perhaps we can make it smaller, which might be the right direction.

Your case is very similar to inference in search and recommendation systems. Fury has been applied in many such scenarios, and they all got a 30ms latency reduction.

Some suggestions here:

We're working on Python serialization optimization, and your case is very interesting. Do you need to pass such objects from Java to Python? If so, maybe Fury can help you speed up your Python inference service too.