Open ghostdogpr opened 11 months ago
Good point! I have wanted to add such a feature for a while but haven't had the time.
We have a writeRef check method in io.fury.resolver.ClassResolver:
    public boolean needToWriteRef(Class<?> cls) {
      if (fury.trackingRef()) {
        ClassInfo classInfo = getClassInfo(cls, false);
        if (classInfo == null || classInfo.serializer == null) {
          // TODO group related logic together for extendability and consistency.
          return !cls.isEnum();
        } else {
          return classInfo.serializer.needToWriteRef();
        }
      }
      return false;
    }
It's used mainly in Collection/Map element serialization and during Fury code generation. Here are the design considerations we made:
Do not invoke this method every time a new object is serialized, since it adds a hash map lookup whose cost is similar to reference tracking itself when the object graph is small. It only gives better performance for large object graphs. In such graphs, the map of object classes is much smaller than the map of referenced objects, so querying whether to track a reference is much cheaper than tracking the reference.
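To make the size argument concrete, here is a minimal sketch in plain Java (no Fury classes involved). The class name GraphMapSizes and its count method are made up for illustration; the sketch only compares how many entries a class-keyed map and an identity-keyed reference map end up holding for the same list of objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Fury: compares the size of a per-class
// decision map with the size of a per-object reference map.
final class GraphMapSizes {
    /** Returns {number of distinct classes, number of distinct object identities}. */
    static int[] count(List<Object> elements) {
        Map<Class<?>, Boolean> classDecisions = new HashMap<>();
        Map<Object, Integer> refIds = new IdentityHashMap<>();
        for (Object o : elements) {
            // One entry per class, no matter how many instances are seen.
            classDecisions.putIfAbsent(o.getClass(), Boolean.TRUE);
            // One entry per object identity, as reference tracking requires.
            refIds.putIfAbsent(o, refIds.size());
        }
        return new int[] {classDecisions.size(), refIds.size()};
    }

    public static void main(String[] args) {
        List<Object> graph = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            graph.add(i % 2 == 0 ? new StringBuilder("s" + i) : new ArrayList<Integer>());
        }
        int[] sizes = count(graph);
        // prints "classes=2, references=10000"
        System.out.println("classes=" + sizes[0] + ", references=" + sizes[1]);
    }
}
```

With only a handful of objects the two maps are about the same size, which is why the per-class lookup does not pay off for small graphs.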
So we made a tradeoff: if a class is registered for no-ref tracking, all of its subclasses are mostly treated as no-ref tracking too. This way, the generated code for polymorphic types can skip the reference tracking check, which minimizes that cost.
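The subclass rule can be sketched roughly as follows. This is not Fury's implementation: NoRefRegistry, registerNoRef, and the caching strategy are assumptions for illustration, and the enum fallback simply mirrors the !cls.isEnum() default in the method quoted above. The sketch is single-threaded.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry, not Fury's API: a class registered as no-ref
// makes every subclass (or implementor) no-ref as well.
final class NoRefRegistry {
    private final Set<Class<?>> noRefRoots = new HashSet<>();
    private final Map<Class<?>, Boolean> cache = new HashMap<>();

    /** Marks cls and everything assignable to it as not needing ref tracking. */
    void registerNoRef(Class<?> cls) {
        noRefRoots.add(cls);
        cache.clear(); // earlier decisions may no longer hold
    }

    /** Decides once per class; later calls for the same class hit the cache. */
    boolean needToWriteRef(Class<?> cls) {
        return cache.computeIfAbsent(cls, c -> {
            for (Class<?> root : noRefRoots) {
                if (root.isAssignableFrom(c)) {
                    return false; // inherited no-ref registration
                }
            }
            return !c.isEnum(); // same default as the quoted method
        });
    }
}
```

For example, after registerNoRef(Number.class), needToWriteRef(Integer.class) returns false while needToWriteRef(StringBuilder.class) still returns true. A code generator that sees a field declared as Number could therefore omit the check entirely, since no runtime subclass can change the answer.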
I hope this gives you some insight into how ref tracking works in Fury and helps you write your own RefResolver.
Fury could provide a method to let you set the RefResolver factory when configuring FuryBuilder. We keep the created refResolver as a final field of Fury to reduce field access cost, so you would pass a factory and let Fury create your RefResolver.
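As a rough sketch of what the factory approach might look like: none of the names below exist in Fury. SketchRefResolver, SketchSerializer, SketchBuilder, and withRefResolverFactory are stand-ins invented for this example, and the resolver contract is deliberately reduced to a single method.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical resolver contract; not Fury's actual RefResolver interface. */
interface SketchRefResolver {
    /** Returns the id assigned earlier to obj, or -1 if obj is seen for the first time. */
    int writeRefOrGetId(Object obj);
}

/** Default behaviour in this sketch: track every object by identity. */
final class IdentityRefResolver implements SketchRefResolver {
    private final Map<Object, Integer> ids = new IdentityHashMap<>();

    @Override
    public int writeRefOrGetId(Object obj) {
        Integer previous = ids.putIfAbsent(obj, ids.size());
        return previous == null ? -1 : previous;
    }
}

/** Stand-in for the serializer; the resolver is created once and stored in a final field. */
final class SketchSerializer {
    private final SketchRefResolver refResolver;

    SketchSerializer(Supplier<SketchRefResolver> factory) {
        this.refResolver = factory.get();
    }

    SketchRefResolver refResolver() {
        return refResolver;
    }
}

/** Stand-in for the builder; accepts a factory rather than a resolver instance. */
final class SketchBuilder {
    private Supplier<SketchRefResolver> factory = IdentityRefResolver::new;

    SketchBuilder withRefResolverFactory(Supplier<SketchRefResolver> factory) {
        this.factory = factory;
        return this;
    }

    SketchSerializer build() {
        return new SketchSerializer(factory);
    }
}
```

Taking a factory instead of an instance means each built serializer gets its own resolver state, which matters because a resolver is stateful, and the field can stay final.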
Another approach is to provide a method in Fury such as trackingRef(Class, bool), which you can invoke to control which classes are serialized by reference.
Or Fury could provide an annotation that lets you mark your classes or fields with trackingRef; this is mentioned in #1148 too.
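An annotation-based variant might look something like the sketch below. The annotation name TrackingRef, its boolean value, and the precedence rule (field annotation first, then the field type's annotation, then a global default) are all assumptions made for this example, not an existing Fury API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

/** Hypothetical annotation; name and semantics are assumptions. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD})
@interface TrackingRef {
    boolean value() default true;
}

/** A plain value type that can never be circular, so tracking is switched off. */
@TrackingRef(false)
final class SketchPoint {
    int x;
    int y;
}

final class SketchNode {
    @TrackingRef
    SketchNode next;      // may point back to an earlier node

    SketchPoint position; // falls back to the annotation on SketchPoint
}

final class TrackingRefReader {
    /** Resolves the setting for one field: field annotation, then type annotation, then default. */
    static boolean trackingRef(Class<?> owner, String fieldName, boolean globalDefault) {
        Field field;
        try {
            field = owner.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field " + fieldName + " on " + owner, e);
        }
        TrackingRef onField = field.getAnnotation(TrackingRef.class);
        if (onField != null) {
            return onField.value();
        }
        TrackingRef onType = field.getType().getAnnotation(TrackingRef.class);
        return onType != null ? onType.value() : globalDefault;
    }
}
```

Since annotations are fixed at compile time, a code generator could read them once per class and bake the result into the generated serializer, with no lookup at serialization time.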
Those methods all make sense to me, and I believe Fury will support them all in the long run.
Is your feature request related to a problem? Please describe.
Disabling reference tracking gives much better performance, but it is a bit dangerous with some types.
What we currently do with Kryo is make a custom ReferenceResolver where we implement our own public boolean useReferences(Class type) method. That way, we dynamically disable reference tracking on all our "known" safe types, but we still use it for other "unknown" types that may be circular (for example, Throwable can be circular because of cause).
Describe the solution you'd like
A way to customize reference tracking.
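To illustrate the kind of hook we have in mind, here is a library-independent sketch of the per-type predicate described above. It uses neither Kryo nor Fury classes; SelectiveRefTracker, alreadyWritten, and writeCauseChain are names invented for this example.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical tracker: a predicate decides per class whether instances
// go through the identity map at all.
final class SelectiveRefTracker {
    private final Predicate<Class<?>> useReferences;
    private final Map<Object, Integer> seen = new IdentityHashMap<>();

    SelectiveRefTracker(Predicate<Class<?>> useReferences) {
        this.useReferences = useReferences;
    }

    /** Returns true if obj was written before and a back-reference should be emitted instead. */
    boolean alreadyWritten(Object obj) {
        if (!useReferences.test(obj.getClass())) {
            return false; // known-safe type: skip the identity map entirely
        }
        return seen.putIfAbsent(obj, seen.size()) != null;
    }

    int trackedCount() {
        return seen.size();
    }

    /** Walks a cause chain and stops at the first back-reference; returns how many throwables were written. */
    static int writeCauseChain(Throwable t, SelectiveRefTracker tracker) {
        int written = 0;
        while (t != null && !tracker.alreadyWritten(t)) {
            written++;
            t = t.getCause();
        }
        return written;
    }
}
```

With a predicate such as c -> !c.equals(String.class), strings never touch the identity map, while two exceptions that are each other's cause are still written exactly once each instead of looping forever.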
Additional context
For reference, our current benchmarks:
Fury without tracking is almost 2x faster than Kryo without tracking, so we are hopeful that with customized tracking we would achieve better performance than with customized Kryo.