dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

IXmlSerializable makes XmlSerializer.Deserialize create an instance of the base class instead of the derived class #101654

Open MMariusch opened 2 weeks ago

MMariusch commented 2 weeks ago

Description

When a base class implements IXmlSerializable, XmlSerializer.Deserialize tries to create an instance of the base class and calls its virtual ReadXml instead of the overridden ReadXml in the derived class.

Reproduction Steps

In the following example the deserializer is trying to make an instance of the base class instead of the derived one.

Repository with an example: https://github.com/MMariusch/Example

The code:

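using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class TestClass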
{
    [XmlElement("AClass", typeof(AClass))]
    List<BaseClass> objects = new List<BaseClass>();

    public TestClass(){}

    public void SaveToFile()
    {
        objects.Add(new AClass("someString"));
        objects.Add(new AClass("anotherString"));
        var xmlSerializer = new XmlSerializer(typeof(List<BaseClass>));
        using (var writer = new StreamWriter("SaveList.xml"))
        {
            xmlSerializer.Serialize(writer, objects);
        }
    }

    public void LoadFromFile()
    {
        if (File.Exists("SaveList.xml"))
        {
            using (var reader = new StreamReader("SaveList.xml"))
            {
                var deserializedList = new XmlSerializer(typeof(List<BaseClass>)).Deserialize(reader) as List<BaseClass>;
                if (deserializedList != null && deserializedList.Count > 0)
                {
                    objects = deserializedList;
                }
            }
        }
    }
}

[XmlInclude(typeof(AClass))]
public class BaseClass : IXmlSerializable
{
    public BaseClass() { }
    public XmlSchema GetSchema() { return null; }
    public virtual void ReadXml(XmlReader reader) 
    {
        Console.Write("It shouldn't be triggered.");
    }
    public virtual void WriteXml(XmlWriter writer) { }
}

public class AClass : BaseClass
{
    private string _stringVar;
    public string StringVar { get => _stringVar; private set => _stringVar = value; }
    public AClass() { }
    public AClass(string stringVar)
    {
        _stringVar = stringVar;
    }

    public override void ReadXml(XmlReader reader)
    {
        reader.MoveToContent();
        var anyElements = !reader.IsEmptyElement;
        reader.ReadStartElement();
        if (anyElements)
        {
            _stringVar = reader.ReadElementContentAsString("StringVar", "");
            reader.ReadEndElement();
        }
    }

    public override void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("xsi", "type", null, "AClass");
        writer.WriteElementString("StringVar", _stringVar);
    }
}

Expected behavior

Should create an instance of the derived class.

Actual behavior

Creates an instance of the base class.

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

KalleOlaviNiemitalo commented 1 week ago

The [XmlElement("AClass", typeof(AClass))] attribute on the TestClass.objects field is ignored by XmlSerializer because it is not serializing or deserializing a TestClass instance.
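To illustrate when a member-level [XmlElement] attribute does take effect, here is a minimal sketch that serializes the containing type itself. The names ElementBase, ElementDerived, Container and ContainerDemo are hypothetical and not from the issue, and the classes deliberately do not implement IXmlSerializable, so this only shows the general attribute behaviour, not the reporter's scenario. Note also that XmlSerializer only considers public fields and properties, so the member is public here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical plain types (no IXmlSerializable) used only for illustration.
public class ElementBase { }

public class ElementDerived : ElementBase
{
    public string StringVar { get; set; }
}

public class Container
{
    // Applies only when a Container instance is (de)serialized.
    // The member must be public for XmlSerializer to see it.
    [XmlElement("AClass", typeof(ElementDerived))]
    public List<ElementBase> Objects = new List<ElementBase>();
}

public static class ContainerDemo
{
    // Serializes a Container (not a bare List<ElementBase>) to a string.
    public static string Serialize(Container container)
    {
        var serializer = new XmlSerializer(typeof(Container));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, container);
            return writer.ToString();
        }
    }

    // Reads a Container back from the XML text.
    public static Container Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(Container));
        using (var reader = new StringReader(xml))
        {
            return (Container)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var container = new Container();
        container.Objects.Add(new ElementDerived { StringVar = "someString" });

        string xml = Serialize(container);
        Console.WriteLine(xml); // list items appear as <AClass> elements

        Container copy = Deserialize(xml);
        Console.WriteLine(copy.Objects[0].GetType().Name);
    }
}
```

In the original repro the serializer is constructed for typeof(List<BaseClass>), so no TestClass member attributes are ever inspected.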

The generated SaveList.xml is:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfBaseClass xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <BaseClass xsi:type="AClass">
    <StringVar>someString</StringVar>
  </BaseClass>
  <BaseClass xsi:type="AClass">
    <StringVar>anotherString</StringVar>
  </BaseClass>
</ArrayOfBaseClass>

If BaseClass did not implement IXmlSerializable, XmlSerializer.Deserialize would recognize xsi:type="AClass" and create an instance of AClass. Because BaseClass does implement IXmlSerializable, XmlSerializer ignores the xsi:type attribute.
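For comparison, the following sketch shows the default behaviour with types that do not implement IXmlSerializable. PlainBase, PlainDerived and PlainXsiTypeDemo are hypothetical names chosen for this illustration. With [XmlInclude] on the base type, XmlSerializer writes the xsi:type attribute itself and uses it on deserialization to pick the derived type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for BaseClass/AClass WITHOUT IXmlSerializable.
[XmlInclude(typeof(PlainDerived))]
public class PlainBase { }

public class PlainDerived : PlainBase
{
    public string StringVar { get; set; }
}

public static class PlainXsiTypeDemo
{
    // Serializes a list holding one derived instance and reads it back.
    public static List<PlainBase> RoundTrip()
    {
        var serializer = new XmlSerializer(typeof(List<PlainBase>));
        var source = new List<PlainBase>
        {
            new PlainDerived { StringVar = "someString" },
        };

        using (var writer = new StringWriter())
        {
            // The serializer emits <PlainBase xsi:type="PlainDerived"> on its own.
            serializer.Serialize(writer, source);

            using (var reader = new StringReader(writer.ToString()))
            {
                return (List<PlainBase>)serializer.Deserialize(reader);
            }
        }
    }

    public static void Main()
    {
        List<PlainBase> result = RoundTrip();
        Console.WriteLine(result[0].GetType().Name); // prints "PlainDerived"
    }
}
```

The difference in the reported case is only that the base type implements IXmlSerializable; the list type, the [XmlInclude] attribute and the XML shape are otherwise the same.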

Analysis

The behaviour seems to be the same in .NET 8.0 and in .NET Framework 4.8.

XmlSerializer in .NET Framework still includes the legacy C# code generator. That likewise has the same behaviour. The generated C# code can be debugged with this app.config:

<configuration>
  <system.xml.serialization>
    <xmlSerializer useLegacySerializerGeneration="true" />
  </system.xml.serialization>
  <system.diagnostics>
    <switches>
      <add name="XmlSerialization.Compilation" value="true" />
    </switches>
  </system.diagnostics>
</configuration>

The deserializer code generators have branches specifically for checking the xsi:type attribute when deserializing a type that implements IXmlSerializable.

Those do not trigger with your sample code; SerializableMapping.DerivedMappings is apparently null.

Workaround (or intended use?)

If you apply XmlSchemaProviderAttribute, then the deserializer checks the xsi:type attribute and creates an instance of AClass:

[XmlSchemaProvider(nameof(GetSchema))]
public partial class BaseClass : IXmlSerializable
{
    internal static XmlSchema Schema { get; }
        = new XmlSchema()
    {
        Items =
        {
            new XmlSchemaComplexType()
            {
                Name = "BaseClass",
            },
            new XmlSchemaComplexType()
            {
                Name = "AClass",
                ContentModel = new XmlSchemaComplexContent()
                {
                    Content = new XmlSchemaComplexContentExtension()
                    {
                        BaseTypeName = new XmlQualifiedName("BaseClass"),
                        Attributes =
                        {
                            new XmlSchemaAttribute()
                            {
                                Name = "StringVar",
                                SchemaTypeName = new XmlQualifiedName("string", ns: XmlSchema.Namespace),
                            },
                        },
                    },
                },
            },
        },
    };

    public static XmlQualifiedName GetSchema(XmlSchemaSet xs)
    {
        xs.Add(BaseClass.Schema);
        return new XmlQualifiedName("BaseClass");
    }
}

[XmlSchemaProvider(nameof(GetSchema))]
public partial class AClass : BaseClass
{
    public static new XmlQualifiedName GetSchema(XmlSchemaSet xs)
    {
        xs.Add(BaseClass.Schema);
        return new XmlQualifiedName("AClass");
    }
}

I don't know how much detail can be omitted from the XML schema.

MMariusch commented 1 week ago

Thank you for the answer. If I understand correctly, to add another derived class I need to modify the schema. I made something like this:

internal static XmlSchema Schema { get; } = new XmlSchema()
{
  Items = 
  { 
    new XmlSchemaComplexType() { Name = "BaseClass" },
    new XmlSchemaComplexType() 
    {
      Name = "AClass", ContentModel = new XmlSchemaComplexContent()
      {
        Content = new XmlSchemaComplexContentExtension()
        {
          BaseTypeName = new XmlQualifiedName("BaseClass"), Attributes =
          {
            new XmlSchemaAttribute() 
            { Name = "AStringVar", SchemaTypeName = new XmlQualifiedName("string", ns: XmlSchema.Namespace), },
          },
        },
      },
    },
    new XmlSchemaComplexType() 
    { 
      Name = "BClass", ContentModel = new XmlSchemaComplexContent()
      { 
        Content = new XmlSchemaComplexContentExtension() 
        {
          BaseTypeName = new XmlQualifiedName("BaseClass"), Attributes = 
          { 
            new XmlSchemaAttribute()
            { Name = "BStringVar", SchemaTypeName = new XmlQualifiedName("string", ns: XmlSchema.Namespace), }, 
          },
        },
      },
    },
  }
};

Unfortunately this doesn't work. The deserializer skips BClass.

KalleOlaviNiemitalo commented 1 week ago

Please show how you define BClass and what goes into the XML file.

MMariusch commented 1 week ago

Oops. I forgot to add the XmlInclude attribute. Everything works fine. Thank you.

Getting IXmlSerializable to work takes a lot of effort. That's why I'm wondering whether this should be treated as a bug.

KalleOlaviNiemitalo commented 1 week ago

The documentation could be improved for sure. The IXmlSerializable documentation explains how to provide a schema but does not mention that XmlSerializer requires a schema for deserialising instances of a derived class. I think a note about this would be useful in the XmlIncludeAttribute documentation as well.

It makes some sense that, by implementing IXmlSerializable, the developer also takes responsibility for choosing the XmlQualifiedName for the xsi:type attribute. But perhaps XmlSerializer could be changed to require only this XmlQualifiedName and not an entire XmlSchema.

MMariusch commented 1 week ago

You are right. Improved documentation would also be really helpful. But personally I would prefer to keep modifications of the base class to a minimum.

I've now encountered another problem. I changed BaseClass into an abstract class and updated the XmlSchemaComplexType for "BaseClass" by setting its IsAbstract flag to true. Unfortunately I get an error because the deserializer tries to create an instance of BaseClass instead of the derived class.

I updated my project on repo: https://github.com/MMariusch/Example