Open tsl0922 opened 8 years ago
This can happen if you try to deserialize different versions of a class. Checking each field would slow down too much. I think it could be achieved by turning of Unsafe (type check is at a lower level then) but still this will yield a ~20% slowdown. Inspecting the code, the following should be done:
Thanks for the report :)
@RuedigerMoeller Does FST respect serialVersionUID? If we changed a class and it's serialVersionUID, will deserialization be success?
no, fst is absolutely trimmed for max performance. you might add a custom serializer for classes that require a version check
Am Dienstag, 24. November 2015 schrieb Shuanglei Tao :
@RuedigerMoeller https://github.com/RuedigerMoeller Does FST respect serialVersionUID? If we changed a class and it's serialVersionUID, will deserialization be success?
— Reply to this email directly or view it on GitHub https://github.com/RuedigerMoeller/fast-serialization/issues/103#issuecomment-159197999 .
I have same problem if new version add fields with same serialVersionUID, I hope fst provide an option to respect serialVersionUID for safe deserialize with new version class, It's very important for rpc, service provider upgraded and service consumer will fail if not upgraded which is normal in real world.
@quaff you should try protobuffer instead.
@thefallentree protobuf and thrift and avro need schema to generate NON-POJO code, It is not intuitional, I like the way similar with java.io.Serializable.
@RuedigerMoeller It seems Unsafe can be disabled for serialization by FSTUtil.unFlaggedUnsafe = null
without changing the source code of fst, is it right?
It should be like that, however there is the lazy init in "getUnsafe", so only direct references to "unflaggedUnsafe" are blocked, so double check this. Serialization has a fallback implementations if unsafe is not available (10-15%) penalty, some of the offheap stuff not.
General remark:
You need to plan ahead regarding version issues, there are several aproaches which do not impose a performance/message size penalty:
strategies for backward compatibility:
1) Do not modify classes but instead subclass from version 1. E.g. MyDataV1 extends MyData
.
2) I have also used package renaming to deal with different versions (my.package.v1, mypackage.v2).
3) put in a "Object extraData", which can be used as a container for additions later on (simple cases)
There is no such thing like automatical versioning, you have to deal with it explicitely at application level most of the time.
I'll still try to figure out a scheme to solve the issue without hurting use cases of fst handling versioning e.g. at connect time.
Thanks for you reply, that is very useful.
However, deserialize different versions of a class is very common in the real world. If fst can not ensure data correctness for this situation, you may add a note on the readme for new users, because it is different from java serialization, fst does not throw an exception for data error caused by version at deserialize time(it is hard to debug), but java serialization does.
its already in the readme afaik (limits or so). Its true this is not same behaviour as java serialization, but again I won't pay a significant performance penalty to support a feature only some need or that can be implemented without runtime cost by applying some foresight ;).
To be frank, I think a serialization library that doesn't talk much on versioning is not a good way of doing things. It feels that this project is ignoring how real application develops in the wild.
We were attracted by fst promise of fairyland and bitten by subtle versioning problems later. It's not a very promising sign of the project and we simply don't know what other issues may lay ahead to justify.
That being said, I totally agree with you that support versioning/field changes while being fast is hard. We may end up still looking at using protobuf later, sometime hard choice is making future life easier.
Cheers.
On Wed, Nov 25, 2015 at 8:58 PM, moru0011 notifications@github.com wrote:
its already in the readme afaik (limits or so). Its true this is not same behaviour as java serialization, but again I won't pay a significant performance penalty to support a feature only some need or that can be implemented without runtime cost by applying some foresight ;).
— Reply to this email directly or view it on GitHub https://github.com/RuedigerMoeller/fast-serialization/issues/103#issuecomment-159600262 .
Well, I'll put a 'limits' section in the readme. BTW: Has any of you guys read the doc about "@version" - annotation ? It supports extension of existing classes within some limits (which are hard to understand I admit).
One of the issues of stock jdk serialization is, it provides a lot of possibly unneeded features resulting in very poor performance. FST does the contrary: it implements core serialization very fast, ommiting costly features. With any tradeoff, there is a price to pay. However adding features works, removing features not :)
Adding app level version awareness is trivial, in case you face this issue after having rolled out your stuff: that's plain bad engineering. Having a space and performance impact for each object written just to enable a simple exception throing version check is a bad decision imo. That's not a way to build first class high performance RPC systems. Just use stock jdk serialization then.
Protobuf: Ofc you can do that, its just much more expensive regarding development cost and its way slower if you count in the fact you need to translate protobuf message structures to real application objects. Protobuf does not even support interlinked object graphs properly. Its a good solution for cross-language rpc, however its an expensive choice to build a distributed system in java in an elegant way. I'd choose SBE over protobof btw.
I agree, performance is important, but data correctness should be considered before performance, because we can't use unreliable data. I will try to add an app level version check for this, but I hope fst can handle this itself in the future.
i will add an "@checkversion" classlevel annotation + global option to always do this.
:+1:
Another request: can you make fst compatible with serialVersionUID
? I'd like to see it be implemented.
I'll use serialversionId if declared in the class, else I'll use a checksum computed from the type's and names of instance fields. This way objects will become incompatible only in case instance fields change (jdk ser is more picky in this case afaik). BTW: don't confuse me with a service provider, despite >6000 maven.org downloads from > 2500 distinct IP's (=pom.xml from projects all over the world) per month, I don't make a dime out of this.
In general I prioritize fst features I need for my commercial work, and I always handle versioning at app level as having thrown an exception rarely is sufficient to deal with the issue. So pull requests are welcome ^^.
@RuedigerMoeller Thanks!
This is the field of class Project:
This is the deserialized Object in debugger after fields changed(I'm running 2 different version of my application):
No exception is thrown when deserialize, but exception will be thrown when the result is used. I think FST should check if the deserialized field value matches the origin declaration and throw exception if it doesn't match.
I'm using fst 2.28, this issue also exists in the latest version(2.43).