microsoft / service-fabric-issues

This repo is for the reporting of issues found with Azure Service Fabric.
168 stars 21 forks source link

Removing a property from the class of Reliable Dictionary value throws an error #1660

Closed Fanchao92 closed 4 years ago

Fanchao92 commented 4 years ago

I am using out-of-box Reliable Dictionary in my SF actor. Both the key and value are custom class. And the value class inherits from a parent class. The parent class has a property of class type, from which I just removed a property. Simply speaking, it's like this:

[DataContract]
MyCustomReliableDictionaryValueClass: MyCustomBaseClass

[DataContract]
MyCustomBaseClass
{
    [DataMember]
    MyCustomPropertyClass property1;
}

[DataContract]
MyCustomPropertyClass
{
    [DataMember]
    int a;

    [DataMember]
    string b;

    [DataMember]
    ConcurrentDictionary<string, double> dictionary;// This property is removed from my latest code
}

Reliable Dictionary name remains unchanged.

Expected Behavior

Deployment should go smoothly.

Current Behavior

Got this error in my Windows Event Viewer: Message: 00000000-0000-0000-0000-7c4d59b82902@12294600004978481@fabric:/StateManager: Below type used in Reliable Collection urn:MyDictionaryName/dataStore could not be loaded. This commonly indicates that the user application is not backwards/forwards compatible. Common compatibility bugs that lead to this error are adding a new type or changing an assembly name without two phase upgrade, or removing a type. If this was caused by user's backwards/forwards compatibility bug, one way to mitigate the issue is to force the upgrade through without safety checks.

Steps to Reproduce

  1. Build and deploy the original code where MyCustomPropertyClass has the concurrent dictionary
  2. Remove the concurrent dictionary property from MyCustomPropertyClass
  3. Build and deploy the new code
ashishnegi commented 4 years ago

You are using the default serializer, which is not backward compatible. Please see here for details: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-reliable-collections-serialization#upgradability

Your old data on disk has property dictionary which data contract serializer can't serialize back. Either you can write your custom serializer/deserializer or go through multiple phase upgrades where you can keep old/new types and rewrite whole data to new format and then remove old types.

Fanchao92 commented 4 years ago

@ashishnegi So can I do the following instead:

  1. Add back the dictionary property to MyCustomPropertyClass with default value being null and EmitDefaultValue being false. Since I've already removed the property from MyCustomPropertyClass , now there's no place in my code referencing this property. So other than the old values that will still have a non-null dictionary property, all the new values added to the reliable dictionary will have a null dictionary property.
  2. Roll out the code from step 1
  3. After I ensure all the values with the old schema have been removed from the reliable dictionary, I can safely remove the dictionary property from MyCustomPropertyClass.
  4. Roll out the new code

Also, could you detail how to "keep old/new types and rewrite whole data to new format and then remove old types" without implementing a custom serializer/deserializer?

Fanchao92 commented 4 years ago

@ashishnegi So can I do the following instead:

  1. Add back the dictionary property to MyCustomPropertyClass with default value being null and EmitDefaultValue being false. Since I've already removed the property from MyCustomPropertyClass , now there's no place in my code referencing this property. So other than the old values that will still have a non-null dictionary property, all the new values added to the reliable dictionary will have a null dictionary property.
  2. Roll out the code from step 1
  3. After I ensure all the values with the old schema have been removed from the reliable dictionary, I can safely remove the dictionary property from MyCustomPropertyClass.
  4. Roll out the new code
ashishnegi commented 4 years ago

@Fanchao92 I think your above method of making EmitDefaultValue being false should work. Can you confirm with a unit test where you force emitting this value during serialization [ simulating old values ] and then deserialize it back ? We just want deserialize to pass, you don't care about dictionary property.

Details on keeping old/new values:

  1. Create class MyCustomPropertyClassV2 which does not have this property.
  2. Use this new class in new ReliableDictionary with different name.
  3. Write all data from old ReliableDictionary to new ReliableDictionary. If old reliable dictionary is big, and you need to do it in background, you will need to change the business login to use 2 ReilableDictionaries.
  4. use RemoveAsync on ReliableDictionary.
  5. remove old class MyCustomPropertyClass. Upgrade again. This needs to be done after some time because RemoveAsync works in background.

/cc @zuhairp for any corrections in above steps.

This change can be cumbersome. Right way for any non trivial service is to use forward/backward compatible serializer/deserializer.

Fanchao92 commented 4 years ago

Hi @ashishnegi The deployment still failed due to the same reason after I added back the property and set it default to null and emittable only if non-null. Another thing that I might probably mention, is that I also changed a method name in MyCustomReliableDictionaryValueClass. Will this matter?

vinay-y commented 4 years ago

RCA Update:

In this case, assembly name was changed for the Value Type. Changing the assembly name is a breaking change from the StateManager point of view. <TypeName, Assembly Name> together is seen as the type and the types being used should be forward and backward compatible across app upgrades.

To check compatibility of app versions, this tool can be used. https://github.com/hiadusum/ReliableCollectionsMissingTypesTool/releases

To change the assembly name of a type used in Reliable Collections, the following multi phase upgrade should be used.

Let T be a type in assembly A1. Suppose <T, A1> is to be changed to <T, A2>, Phase 1: Add a new type <T, A2> but do not use it. Keep the type <T, A1> intact. Phase 2: Change all usage of <T, A1> to <T, A2>. The type <T, A1> should still exist but shouldn't be used. Phase 3: Remove <T, A1>

DataContract should be able to handle adding and removing data members. https://docs.microsoft.com/en-us/dotnet/framework/wcf/feature-details/data-contract-versioning#adding-and-removing-data-members For additional functionality custom serializer and deserializer can be used with forward and backward compatibility.