dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.05k stars 4.69k forks source link

[API Proposal]: Retrofit XmlNodeList to implement IReadOnlyList #101496

Open jhudsoncedaron opened 5 months ago

jhudsoncedaron commented 5 months ago

Background and motivation

I find myself in an interesting place. When operating Xml documents that are alternately edited with a text editor and a graphical component; and manual intentation must be preserved, the old System.Xml.XmlDocument library is more fit for purpose than the newer System.Xml.Linq.XDocument.

The worst usability issue in the XmlDocument library is the fact that XmlNodeList is not generic. This turns out to be a fixable problem.

API Proposal

// Note that this isn't nullable despite this[int] being nullable. Read the docs on when that returns null.
public partial class XmlList : IReadOnlyList<XmlNode> {
    // Strictly speaking, this method isn't necessary, but I found it improves readability so much it's worth it.
    // I wouldn't do this if I didn't know it would JIT away to nothing.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public IReadOnlyList<XmlNode> AsReadOnlyList() => this;

    int IReadOnlyList<XmlNode>.Count => Count;
    XmlNode IReadOnlyList<XmlNode>.this[int i] => this[i] ?? throw new IndexOutOfRangeException();
    IEnumerator<XmlNode> IReadOnlyList<XmlNode>.GetEnumerator();
}

GetEnumerator requires a real implementation; the Linq optimization doesn't work here because XmlNodeList already implements IDisposable. It might be as simple as this:

public partial class XmlList : IReadOnlyList<XmlNode> {
    IEnumerator<XmlNode> IReadOnlyList<XmlNode>.GetEnumerator() => new Enumerator(this);

    private sealed class Enumerator : IEnuerator<T> {
        private readonly XmlNodeList list;
        private int index = -1;

        internal Enuemerator((XmlNodeList owner) => list = owner;

       public bool MoveNext() => ++index < Count;
       public XmlNode Current => list[index] ?? throw new InvalidOperationException();
       object IEnumerator.Current => Current;
       public void Rewind() { index = -1; }
       public void Dispose() {}
    }
}

API Usage

The gain is twofold.

1)

    foreach (var node in xdoc.SelectNodes("//xpathexpression", namespaces).AsReadOnlyList()) {
    }

which is much better than

    foreach (var node in xdoc.SelectNodes("//xpathexpression", namespaces).Cast<XmlNode>()) {
    }

because all the runtime casts disappear. I could have done almost as good with an extension method; but the second benefit not so much.

2) As XmlNodeList now implements IReadOnlyCollection it can just be passed to Linq methods and everything that uses generic interfaces works normally. For example:

    foreach (var node in xdoc.SelectNodes("//xpathexpression", namespaces)
          .Where((x) => x.Attributes["typeid"] = "DT"
          .Select((x) => DateTime.Parse(x.InnerText)) {
    }

Alternative Designs

I could do this with extension methods but the allocation penalty is actually rather bad. Allocating an enumerator shim is cheap; failing the reference equality test after unwrapping two layers of Linq and having to enumerate both lists to check if they're identical is not.

Risks

I suppose it's possible that somebody took a dependency on XmlNodeList doesn't implement IReadOnlyList or IEnumerable. I can't imagine what such code would look like where this isn't some unit test designed to prove it doesn't.

dotnet-policy-service[bot] commented 5 months ago

Tagging subscribers to this area: @dotnet/area-system-xml See info in area-owners.md if you want to be subscribed.

teo-tsirpanis commented 5 months ago

This might not be accepted according to the contribution bar of System.Xml.

jhudsoncedaron commented 5 months ago

@teo-tsirpanis : It might not be; but it ought to have been done long ago.

All pre-NET Framework 2.0 collections should be retrofitted to implement generic collections (unless they would be collections of object), no matter where they are. In this case, the corresponding generic collection was added in .NET Framework 4.5 (IList makes no sense here, but IReadOnlyList makes sense).