dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: allow to escape invalid xml characters during xml serialization #104362

Open denbell5 opened 3 months ago

denbell5 commented 3 months ago

Background and motivation

In .NET Framework, XmlSerializer by default escapes characters like \u0001 when it writes non-attribute text (<Value>example&#x1;example</Value>).

In .NET 6, the same code throws an ArgumentException: '', hexadecimal value 0x01, is an invalid character. (the exception starts from here).
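A minimal repro sketch of the failure described above. It assumes the exception comes from a writer created with XmlWriter.Create and default settings (CheckCharacters defaults to true), and it writes the element directly rather than going through XmlSerializer. The class and method names (InvalidCharRepro, WriteValue, ThrowsOnWrite) are hypothetical, chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InvalidCharRepro
{
    // Writes <Value>text</Value> through XmlWriter.Create with default
    // character checking (XmlWriterSettings.CheckCharacters is true by default).
    public static string WriteValue(string text)
    {
        var output = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteElementString("Value", text);
        }
        return output.ToString();
    }

    // Returns true if writing the text fails with ArgumentException.
    public static bool ThrowsOnWrite(string text)
    {
        try
        {
            WriteValue(text);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Calling InvalidCharRepro.ThrowsOnWrite("example\u0001example") is expected to return true on .NET 6, while plain text such as "example" is written without error. When the same text goes through XmlSerializer, the exception may surface wrapped in another exception type, so check the inner exception as well.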

I could not find a convenient way to intercept the write and escape those characters: XmlEncodedRawTextWriter is internal, so it cannot be extended. XmlTextWriter, which .NET Framework uses, is in the System.Private.Xml library, which is not accessible, unless I am missing something.

API Proposal

namespace System.Xml
{
    //
    // Summary:
    //     Specifies a set of features to support on the System.Xml.XmlWriter object created
    //     by the Overload:System.Xml.XmlWriter.Create method.
    public sealed class XmlWriterSettings
    {
        //
        // Summary:
        //     Gets or sets a value indicating whether to escape invalid characters when writing non-attribute text.
        //
        // Returns:
        //     true to escape invalid characters when writing non-attribute text; otherwise, false. The default is
        //     false.
        public bool EscapeInvalidCharacters { get; set; }
    }
}

API Usage

var xmlWriter = XmlWriter.Create(stringWriter, new XmlWriterSettings
{
    EscapeInvalidCharacters = true
});
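For comparison, here is a sketch using a setting that already exists, XmlWriterSettings.CheckCharacters. It is not the proposed EscapeInvalidCharacters property. The assumption, which should be verified on the target runtime, is that turning character checking off makes the writer emit a numeric character reference for U+0001 in element text instead of throwing. The class name CheckCharactersSketch is hypothetical.

```csharp
using System.IO;
using System.Xml;

public static class CheckCharactersSketch
{
    // Writes <Value>text</Value> with character checking turned off.
    public static string WriteValue(string text)
    {
        var output = new StringWriter();
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            // Existing setting, NOT the proposed EscapeInvalidCharacters.
            // Assumption: with checking off, invalid characters in text
            // content are written as character references, not rejected.
            CheckCharacters = false
        };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteElementString("Value", text);
        }
        return output.ToString();
    }
}
```

Note that a character reference to U+0001 is not well-formed XML 1.0, so a consumer would probably need XmlReaderSettings.CheckCharacters = false to read the result back. CheckCharacters = false also relaxes other character checks, whereas the proposed setting is scoped to non-attribute text only.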

Alternative Designs

No response

Risks

No response

dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @dotnet/area-system-xml
See info in area-owners.md if you want to be subscribed.