Board Review: Support for JSON Merge Patch

mikekistler commented 3 years ago

The recently updated Azure API Guidelines recommend that update operations be implement using the PATCH method and specifically with JSON Merge Patch. The preference for JSON Merge Patch is that it is naturally idempotent so no special support is required to ensure idempotency. JSON Merge Patch is also superior to PUT because PUT from a client that is not at the latest api-version can remove properties that were added in later api-versions.

But JSON Merge Patch, as defined in RFC 7386, has special semantics, particularly for values passed as "null".

Null values in the merge patch are given special meaning to indicate the removal of existing values in the target.

This is a subtle detail that is often overlooked, so it is worth taking a look at this to see if we have any gaps. There are at least 3 aspects to consider:

1) Server-side implementation

As many Azure services are implemented in ASP.NET Core, we should make sure there is proper support for JSON Merge-Patch provided in this framework and that it is well documented and easy to use.

2) Client-side support

Do our current generated (or hand-written) clients properly support passing "null" explicitly for a property value to an update operation that uses JSON Merge Patch? Apparently some languages do support this:  

Go: has a NullValue() function that returns a reference to a T. This reference is to a singleton instance of T (our sentinel). When this particular T instance is serialized Go emits ‘null’. This works in Go because all properties are defined as reference types.
TypeScript/JavaScript: JavaScript supports this naturally because "null" is a distinct value from "undefined".
Java: Java does not have support for nulls in JSON Merge-Patch but the team has done some investigation on how it might be done:  

https://gist.github.com/anuchandy/ca914f4a8dfaa49ea8e082de7e619c8f

The other languages may have support for this but I'm just not aware of it.

For languages that do have this support, does our testserver verify that it works correctly?

3) Client-side documentation

Even when our client libraries support passing an explicit "null", users need to know about it in order to use it properly. Does our SDK documentation describe this feature sufficiently and is that documentation easily discoverable by users?

lilyjma commented 3 years ago

scheduled for 11/4

tg-msft commented 3 years ago

I see two ways to do this in C#.

An `Optional<T>` approach

As called out above, this really comes down to a standard way of representing undefined, null, and values across languages. Unfortunately languages like C# with separate value and reference types have a harder problem representing a PATCH API in their client libraries. If I declare public int Age { get; set; }, I can't assign either null or undefined. I could make it nullable with public int? Age { get; set; } which would allow me to assign null, but Nullable<T> is still a value type and I can't create a magic sentinel instance to represent undefined.

We would have to add our own Optional<T> that worked kind of like Nullable<T> to wrap patchable value types. It isn't great for several reasons. You'll have to unwrap the Optional and then Nullable to get a value (or we merge Optional and Nullable together which is probably even more confusing to C# devs). If we implement Optional ourselves, it won't have the same language conveniences like operator lifting (i.e., in C# null + 2 == null), though one could argue maybe we shouldn't have those semantics anyway for undefined.

Please see https://github.com/Azure/azure-sdk/issues/1566 for a detailed proposal of how this might work - including a strawperson implementation and cross-references to related discussions.

Here's a quick example of what a model definition might look like:

public class Employee
{
    public int? Id { get; set; }
    public Optional<string> Name { get; set; }
    public Optional<int?> Age { get; set; }
    public Optional<string> Title { get; set; }
}

And here's how we'd use it:

// Create a new employee
var e = new Employee { Name = "Bob", Age = 42 };
e = client.CreateEmployee(e);
// sent { "name": "Bob", "age": 42 } over the wire
// received back { "id": 17, "name": "Bob", "age": 42 } from the service

// Lookup an employee
e = client.GetEmployee(e.Id);
// received back { "id": 17, "name": "Bob", "age": 42 } from the service
Console.Write($"Employee {e.Id}")
if (e.Name.HasValue) { Console.Write($" name: {e.Name.Value}"); }
if (e.Age.HasValue) { Console.Write($" age: {e.Age.Value.Value}"); } // Ewwwww
if (e.Title.HasValue) { Console.Write($" title: {e.Title.Value}"); }
Console.WriteLine();

// Change JUST the title
e = new Employee { Id = e.Id, Title = "Bob II" };
e = client.UpdateEmployee(e);
// sent { "id": 17, "title": "Bob II" } over the wire
// received back { "id": 17, "name": "Bob", "age": 42, "title": "Bob II" } from the service

Regular Models + LLC

Given that the API Stewardship Board is pushing idempotent PATCH operations for more and more scenarios, I worry that adding Optional<Nullable<int>> types is going to add a nontrivial conceptual overhead to otherwise simple champion scenarios. Instead, I'd like to keep using regular models. Imagine a model like:

public class Employee
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public int? Age { get; set; }
    public string Title { get; set; }
}

And we could use it like:

// Create a new employee
var e = new Employee { Name = "Bob", Age = 42 };
e = client.CreateEmployee(e);
// sent { "name": "Bob", "age": 42, "title": null } over the wire
// received back { "id": 17, "name": "Bob", "age": 42 } from the service

// Lookup an employee
e = client.GetEmployee(e.Id);
// received back { "id": 17, "name": "Bob", "age": 42 } from the service
Console.Write($"Employee {e.Id}")
if (e.Name != null) { Console.Write($" name: {e.Name}"); }
if (e.Age != null) { Console.Write($" age: {e.Age.Value}"); }
if (e.Title != null) { Console.Write($" title: {e.Title}"); }
Console.WriteLine();

// Change the title (and potentially everything else)
e.Title = "Bob II";
e = client.UpdateEmployee(e);
// sent { "id": 17, "name": "Bob", "age": 42, "title": "Bob II" } over the wire
// received back { "id": 17, "name": "Bob", "age": 42, "title": "Bob II" } from the service

I think all of these semantics still make sense for the champion scenarios. I don't really care that Title == null means "clear any existing Title" so much as I want an Update call to make the service resource look exactly like the model I'm manipulating. While the Update scenario could stomp over other values if someone else had updated Age, services should still be using conditional requests even for PATCH updates to make this safe.

Further, if someone really wanted to update just the title, they could do so using our LLC overload:

client.UpdateEmployee(id: 17, body: RequestContent.Create(new { title = "Bob II" }));

Customers would have to know more about the shape of the wire protocol to do this (and it might not always be as simple as changing the case), but that's a problem we could solve with documentation/samples if we found a strong desire for patching individual properties.

KrzysztofCwalina commented 3 years ago

To avoid the "stomp" problem described above, and to not have to require protocol methods, our patchable models could provide the following APIs:

public class Employee
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public int? Age { get; set; }
    public string Title { get; set; }

    public Employee CreatePatch();
    public static explicit operator RequestContent(Employee value);
}

The usage would be:

e = client.GetEmployee(e.Id);
e = e.CreatePatch();
e.Title = "Bob II";
e = client.UpdateEmployee(e); // this casts e to RequestContent. The cast knows how to create JSON Patch payload
// sent { "title": "Bob II" } over the wire
// received back { "id": 17, "name": "Bob", "age": 42, "title": "Bob II" } from the service

Possibly we could return "patchable" values, i.e. return value of e.CreatePath, from Get methods. In this case users would simply do:

e = client.GetEmployee(e.Id); // this returns patchable value, i.e. returns e.CreatePatch() instead of returning e.
e.Title = "Bob II";
e = client.UpdateEmployee(e); // this casts e to RequestContent. The cast knows how to create JSON Patch payload
// sent { "title": "Bob II" } over the wire
// received back { "id": 17, "name": "Bob", "age": 42, "title": "Bob II" } from the service

The main disadvantage of this idea (of "a patchable model") is that it has perf overhead, i.e. each model has to have an additional field that would store the original.

tg-msft commented 3 years ago

Recording[MS INTERNAL ONLY]

mikekistler commented 3 years ago

@lilyjma Could you please schedule a follow-on meeting for this topic? Thx!

johanste commented 2 years ago

One thing/requirement that is missing from the examples above is the ability to update without doing a GET/read first.

lilyjma commented 2 years ago

second discussion scheduled for 11/30

KrzysztofCwalina commented 2 years ago

I experimented with the idea of "smart" models. Here is the result of the experiment: https://gist.github.com/KrzysztofCwalina/d5c1fba32eb4548f14bbf27184401731

I think smart models add a lot of complexity. Because of that, I would suggest we start with something like untyped JsonPatchBuilder, which users can use to create patch payloads stored in BinaryData or RequestContent, which in turn can be passed to "update" methods. We can think long term about simpler "smart models" if users complain.

tg-msft commented 2 years ago

Recording[MS INTERNAL ONLY]

mikekistler commented 2 years ago

Summary

Java and .NET

Will implement a PatchBuilder.

PatchBuilder will be implemented in core.
PatchBuilder will accept name / values and "build" a RequestContent.
- Names can be "dotted" to set nested fields arbitrarily deep
- Builder methods will allow fields to be "added" or "removed" (set to null)
- "build" might be generic to allow checking that patch structure conforms to a particular class
The "update" method (that calls patch) accepts a Request Content.
https://github.com/Azure/azure-sdk-for-net/issues/27402
https://github.com/Azure/azure-sdk-for-java/issues/27510

Javascript

JavaScript has native support for "null" so it is covered.

Python

Python will continue to use it's special sentinel to signify "null"

https://azuresdkdocs.blob.core.windows.net/$web/python/azure-core/1.20.0/azure.core.html#azure.core.serialization.NULL

Go

Go will continue to use it's special sentinel to signify "null"

// NullValue is used to send an explicit 'null' within a request.
// This is typically used in JSON-MERGE-PATCH operations to delete a value.
func NullValue(v interface{}) interface{} {

Swift ???

Follow-on items

Documentation
Data Plane Codegen support
Test-server test
Update SDK guidelines

HuangXin-MS commented 2 years ago

RFC 7386 is obsoleted by https://www.rfc-editor.org/rfc/rfc7396

github-actions[bot] commented 8 months ago

Hi @mikekistler, we deeply appreciate your input into this project. Regrettably, this issue has remained inactive for over 2 years, leading us to the decision to close it. We've implemented this policy to maintain the relevance of our issue queue and facilitate easier navigation for new contributors. If you still believe this topic requires attention, please feel free to create a new issue, referencing this one. Thank you for your understanding and ongoing support.

Azure / azure-sdk