mamift / LinqToXsdCore

LinqToXsd ported to .NET Core (targets .NET Standard 2 for generated code and .NET Core 3.1, .NET 5+ for the code generator CLI tool).
Microsoft Public License

Support for XSD nillable and xsi:nil="true" #60

Closed jods4 closed 5 months ago

jods4 commented 5 months ago

Fixes #54

This is a larger change than I expected, so here is a description of the changes.

The goal is to support nillable="true" in XSD and xsi:nil="true" in XML. When an element has xsi:nil="true", the exposed typed value is null (this applies to reference types, nullable value types, and typed elements).

XSD handling

Parsing XSD nillable attribute was already there!

Metadata in ClrPropertyInfo has been extended:

[!NOTE] Any combination is acceptable. An element can be optional but not nillable, required and nillable, or optional and nillable. In all three cases, the exposed CLR type is nullable. As shown below, when writing XML, LinqToXsd prefers xsi:nil whenever the schema allows it.

Code generation

There are four main situations to consider: scalars versus lists (repeated elements), each for get and set.

A new test case has been added that covers all cases; see this file for the generated C#: https://github.com/mamift/LinqToXsdCore/blob/662fae09ec17ade84ed4c9bd1b6d41fd3bd757b2/LinqToXsd.Schemas/Tests/Nil/NilTest.xsd.cs

Note that I've also updated the XML documentation comments. Occurrence now includes a nillable keyword. The "Regular expression" (not quite one) marks nillable elements with a <nil> suffix.

/// <summary>
/// <para>
/// Occurrence: optional, nillable, repeating
/// </para>
/// <para>
/// Regular expression: (RequiredRef<nil>, RequiredVal<nil>, RequiredEl<nil>, OptionalRef<nil>?, OptionalVal<nil>?, OptionalEl<nil>?, ListRef<nil>*, ListVal<nil>*, ListEl<nil>*)
/// </para>
/// </summary>

Scalars

Reading a scalar element is quite simple. An additional step has been added that checks for xsi:nil and returns null when it is present.

get
{
  var x = this.GetElement(ElementXName);
  // This condition is generated when element is optional
  if (x == null) return null;
  // This condition is generated when element is nillable (new)
  if (x.IsXsiNil()) return null;
  // Process the non-null value
  return XTypedServices.ParseValue(x);
}
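To make the three outcomes of that getter concrete, here is a minimal, self-contained sketch using only System.Xml.Linq. The IsNil and Read helpers are hypothetical stand-ins written for this illustration; they are not the runtime's IsXsiNil or the generated code, and they only assume that xsi:nil is an XSD boolean attribute in the XMLSchema-instance namespace.

```csharp
#nullable enable
using System;
using System.Xml.Linq;

XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

// Hypothetical stand-in for IsXsiNil(): true when the element carries xsi:nil="true".
// The explicit bool? conversion returns null when the attribute is absent.
bool IsNil(XElement e) => (bool?)e.Attribute(xsi + "nil") == true;

// Mirrors the getter for an optional, nillable string element.
string? Read(XElement parent, XName name)
{
    var x = parent.Element(name);
    if (x == null) return null;  // element is absent (optional)
    if (IsNil(x)) return null;   // element is present but nil (nillable)
    return x.Value;              // element carries a value
}

var doc = XElement.Parse(
    "<Root xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>" +
    "<A xsi:nil='true' /><B>hello</B></Root>");

Console.WriteLine(Read(doc, "A") ?? "(null)"); // (null)  nil element
Console.WriteLine(Read(doc, "B") ?? "(null)"); // hello
Console.WriteLine(Read(doc, "C") ?? "(null)"); // (null)  missing element
```

Both the missing element and the nil element surface as null, which is why the exposed CLR type is nullable in either case.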

Writing is handled mostly at runtime. If the property is nillable, the generated call to SetElement and related methods passes value ?? XNil.Value instead of value. XNil.Value is a well-known singleton object that tells the runtime to create an element with xsi:nil="true". Many changes were needed to ensure that XNil.Value is handled in every code path. As a consequence, when an element is both optional and nillable, setting it to null writes xsi:nil rather than removing the element. This keeps the code generation simpler, and it is also the only way to insert nulls into lists, as described in the next section.

Lists (repeated elements)

There is a very interesting consequence of xsi:nil for lists. Previously, lists never used nullable CLR types: missing elements were simply represented by an empty list. Now, with xsi:nil, the lists themselves are still non-nullable (they may be empty instead), but they may contain null values!

<!-- Contains <A> tags: empty list [ ] -->
<ListA></ListA> 
<!--Contains <B> tags: not empty [ null ] -->
<ListB>
  <B xsi:nil="true" />
</ListB>

So the key change in code generation is that repeated nillable elements generate List<Element?> properties (but optional elements do not).

The rest of the support happens at runtime in XList and its derived classes. XList has a new SupportsXsiNil boolean property that codegen initializes so that the returned list accepts null elements. When SupportsXsiNil is left at its default value of false, passing null values to XList methods throws, as it does today. The property is set with an initializer: new XList() { SupportsXsiNil = true }. I did not add it to the constructor because that turned out to be tricky: the derived classes XSimpleList, XTypedList and XTypedSubstitutedList take a variable number of arguments and sometimes params.
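The mapping between nil elements and null list items can be sketched with plain System.Xml.Linq. This does not use XList; ReadList and Append are hypothetical helpers for illustration only, and string stands in for the element type.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

// Hypothetical reader: each repeated element becomes one list item,
// and an element with xsi:nil="true" becomes a null item.
List<string?> ReadList(XElement parent, XName name) =>
    parent.Elements(name)
          .Select(e => (bool?)e.Attribute(xsi + "nil") == true ? null : e.Value)
          .ToList();

// Hypothetical writer: a null item is stored as an element with xsi:nil="true".
void Append(XElement parent, XName name, string? item)
{
    var e = new XElement(name);
    if (item == null) e.SetAttributeValue(xsi + "nil", true);
    else e.Value = item;
    parent.Add(e);
}

var root = XElement.Parse(
    "<Root xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>" +
    "<ListA></ListA><ListB><B xsi:nil='true' /><B>x</B></ListB></Root>");

var a = ReadList(root.Element("ListA")!, "A");
var b = ReadList(root.Element("ListB")!, "B");
Console.WriteLine(a.Count);       // 0
Console.WriteLine(b.Count);       // 2
Console.WriteLine(b[0] == null);  // True

// Appending null adds a nil element rather than being rejected.
Append(root.Element("ListA")!, "A", null);
Console.WriteLine(ReadList(root.Element("ListA")!, "A").Count); // 1
```

The list itself is never null; only its items can be, which matches the List<Element?> shape described above.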

Runtime

Most xsi:nil helpers have been put in the new XNil class.

XList and its three derived classes have been largely rewritten so that when SupportsXsiNil = true, they can contain and operate on null items. Null CLR items are translated into <Element xsi:nil="true"> (and vice versa).

All the core XObject methods involved in setting element values have been modified to recognize the singleton object XNil.Value, which indicates that xsi:nil must be set on the target element. When null is passed to those methods instead, they work as before and remove the element (which is still how properties that are not nillable operate).

When setting a non-null value, we must always remove xsi:nil, in case it was set before.

[!NOTE] Technically, an XML element could have xsi:nil="true" and also children or attributes. This does not make much sense and simply maps to null in the CLR. Setting a nillable element to null adds xsi:nil="true" and removes any existing children or attributes.
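The two write rules above (a non-null value clears xsi:nil, and null replaces all content with xsi:nil="true") can be sketched as follows. SetNillable is a hypothetical helper built on System.Xml.Linq; the real runtime routes this through XNil.Value and also takes care of element ordering, which this sketch ignores by simply appending.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

// Hypothetical setter for a nillable element; null here plays the role of XNil.Value.
void SetNillable(XElement parent, XName name, string? value)
{
    var x = parent.Element(name);
    if (x == null)
    {
        x = new XElement(name);
        parent.Add(x); // simplification: appended at the end
    }

    if (value == null)
    {
        x.RemoveAll();                          // drop child nodes and attributes
        x.SetAttributeValue(xsi + "nil", true); // then mark the element as nil
    }
    else
    {
        x.SetAttributeValue(xsi + "nil", null); // remove xsi:nil if it was set before
        x.Value = value;
    }
}

var root = XElement.Parse("<Root><Item id='1'><Child /></Item></Root>");
var item = root.Element("Item")!;

SetNillable(root, "Item", null);
Console.WriteLine(item.HasElements);                     // False
Console.WriteLine(item.Attributes().Count());            // 1
Console.WriteLine((bool?)item.Attribute(xsi + "nil"));   // True

SetNillable(root, "Item", "abc");
Console.WriteLine(item.Attribute(xsi + "nil") == null);  // True
Console.WriteLine(item.Value);                           // abc
```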

No special care is taken to declare the xsi namespace, so by default it ends up as a local declaration on every nil element, like:

<document>
  <element xmlns:p0="http://www.w3.org/2001/XMLSchema-instance" p0:nil="true" />
</document>

I thought of adding this declaration at the root of documents that may contain nillable elements, but it turns out that it's not easy to know when an element might be a root when manipulated by user code, nor to find a convenient place to modify the XElement. So I decided the XML was valid and left it like that.

[!TIP] Users that want a "cleaner" document can easily tweak that themselves. Just add: element.Untyped.Add(new XAttribute(XNamespace.Xmlns + "xsi", "http://www.w3.org/2001/XMLSchema-instance")); on the root element where you want xsi declared and XElement serialization will reuse that. This is what I did in my unit tests for example (look at XsiNilTests.cs).
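The effect of that tip can be checked with plain System.Xml.Linq, independently of the library. In this sketch the document and element names are taken from the sample above, and a bare XElement stands in for element.Untyped.

```csharp
using System;
using System.Xml.Linq;

XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

// A nil element whose namespace has no declared prefix anywhere in the tree.
var doc = new XElement("document",
    new XElement("element", new XAttribute(xsi + "nil", true)));

// Without a declaration, the serializer generates a local prefix on the nil element.
string before = doc.ToString(SaveOptions.DisableFormatting);
Console.WriteLine(before);

// Declare the xsi prefix once on the root; serialization reuses it below.
doc.Add(new XAttribute(XNamespace.Xmlns + "xsi", xsi.NamespaceName));
string after = doc.ToString(SaveOptions.DisableFormatting);
Console.WriteLine(after);
```

After the declaration is added, the nil attribute is written as xsi:nil="true" and the namespace URI appears only on the root.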

jods4 commented 5 months ago

@mamift I see the CI build fails because I used a raw string literal in my added tests and it's a preview feature in the old CI dotnet SDK (v6 I believe).

Raw string literals are quite convenient for writing multi-line XML. Do you think you could upgrade the CI SDK to a newer release (v8?)? It would be beneficial to have access to all the new language features.

Otherwise, we could enable preview C# features in the tests csproj, or rewrite the strings as verbatim or regular strings.

mamift commented 5 months ago

OK so it seems upgrading to the .NET 8 SDK was a bit more complicated than just increasing the version number. But I have got the Test project to build and run by setting lang version to preview.

jods4 commented 5 months ago

Thanks @mamift ! Do you have a release planned?

mamift commented 5 months ago

Should be up now: https://www.nuget.org/packages/XObjectsCore/3.4.3

jods4 commented 5 months ago

Awesome thanks! 🎉