System.Text.Json has problems in that it does not serialize/deserialize an entire object graph when the types are inherited. So if you have

public class A
{
    public string BaseClassField { get; set; }
}

public class B : A
{
    public string DerivedField { get; set; }
}

var newb = new B();
newb.BaseClassField = "abc";
newb.DerivedField = "def";

and then ask S.T.J to serialize newb, you will get something like

{
    "DerivedField": "def"
}

... and a lot of your users are not going to want that, and it would be a breaking change, so be aware of that.
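For context, here is a minimal sketch of how System.Text.Json's use of the declared (compile-time) type can drop part of an inherited object graph, assuming plain System.Text.Json on .NET 6 without the newer polymorphism attributes; passing the runtime type is one common workaround:

using System;
using System.Text.Json;

var newb = new B { BaseClassField = "abc", DerivedField = "def" };

// Serialising via the declared base type drops the derived part of the graph:
// only the properties declared on A are written.
Console.WriteLine(JsonSerializer.Serialize<A>(newb));

// Passing the runtime type (or serialising as object) writes the full graph.
Console.WriteLine(JsonSerializer.Serialize(newb, newb.GetType()));

public class A { public string BaseClassField { get; set; } }
public class B : A { public string DerivedField { get; set; } }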
Can getting either .NET client to work in a DI container scenario be made less clunky? So many of the overloads use the default index, but that can only be set on the connection settings, which should be getting created by the DI container, but the container won't always know the index to use at that point.
Also, ConnectionSettings can't be serialized or deserialized, so I can't just create a configuration key/value for "ElasticSearchConnectionInfo", put the serialized connection settings in it, and load them later. Instead I have to use individual config settings and a bunch of
if (configSettingForDisableDirectStreaming == true)
{
    connSettings = connSettings.DisableDirectStreaming();
}
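One way to make that less clunky today is to bind a small options class from configuration and build the ConnectionSettings inside the DI registration, where the default index is known. A rough sketch against NEST 7.x follows; the ElasticsearchOptions type and the "Elasticsearch" section name are invented for illustration:

using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
using Nest;

// Hypothetical options type bound from configuration.
public class ElasticsearchOptions
{
    public string Url { get; set; }
    public string DefaultIndex { get; set; }
    public bool DisableDirectStreaming { get; set; }
}

// Inside ConfigureServices:
services.Configure<ElasticsearchOptions>(configuration.GetSection("Elasticsearch"));

services.AddSingleton<IElasticClient>(sp =>
{
    var options = sp.GetRequiredService<IOptions<ElasticsearchOptions>>().Value;

    var settings = new ConnectionSettings(new Uri(options.Url))
        .DefaultIndex(options.DefaultIndex);

    if (options.DisableDirectStreaming)
    {
        settings = settings.DisableDirectStreaming();
    }

    return new ElasticClient(settings);
});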
There are a bunch of other things that are problematic with using this API and how it follows Lucene or Elasticsearch conventions instead of .NET conventions. It's supposed to be a .NET API. The audience is .NET developers, and to most of us ES and Lucene conventions are not well known, or are contrary to the conventions we are used to. Instead of making Elasticsearch more accessible to .NET developers, these packages maintain that same high knowledge barrier to entry - they just move it from ES to NEST or the low-level client. And this isn't counting the other oddities, like how SearchDescriptor.Index(Indices index) actually accepts a string parameter containing the name of the index, and not whatever type Indices is.
@stevejgordon if you didn't fix this, then you still have a non-user-friendly client package.
Another thing is the difficulty with unit testing due to all the internal setters in POCO payloads like DynamicResponse for a MultiSearch. Instead of being able to create a payload of that type that satisfies a unit test's needs, I can construct most of it by passing in a DynamicDictionary with several nested dictionaries, but I cannot set the Success property on the response, or the ApiCall.Success property. The test code I end up writing looks like this, and boy is it ever a pain to get right.
LMK if you want this moved to a discussion or somewhere else. I couldn't tell if you wanted responses right on this thread or where you wanted them.
It has been quite some time since last August. Is there a reason that there are no published NuGet packages, but only tags in the repo?
Hi @smg-bg. The new client is coming along and should enter beta very soon. As part of this release, we are renaming the client and the package. The alphas are available on NuGet, and we welcome any early feedback.
@stevejgordon
Hi Steve, what is the best way of mocking out the client in 8.x? I'm on beta 5. It seems a lot of interfaces were removed. We used to create a mock of IElasticClient and set it up to return specific responses. Now our classes depend on the ElasticsearchClient class, and we can't replace it with a fake implementation anymore. I don't want to go as low as replacing the transport and playing with inline JSON.
Are there any plans for reintroducing an interface for the client? Basically, I want to test some logic in a class but get Elasticsearch out of the equation.
Hi, @michael-budnik.
There are no plans to reintroduce the removed interfaces in the v8 design.
Ultimately I don't believe these interfaces serve a purpose, and they generally get misused for convenient testing. As we only ever expect there to be a single concrete implementation of the client, I personally feel an interface is an unnecessary abstraction. It also technically means that each addition to the client is breaking (should anyone be implementing the interface) unless we fall back on default interface implementations. I took the decision to remove the ambiguity they introduce, given the extensive changes and code-gen work for this new client. In the redesign, we avoided the internal requirement that some of the interfaces were helping solve, further making them redundant.
I don't feel it should be the responsibility of the client to provide abstractions for the purpose of testing consumer code. Mocking frameworks tend to make this super convenient, but it bloats the assembly and potentially catches people out if the interface has to change to match the evolving implementation.
What I would advocate for in your scenario is introducing your own abstraction in the form of an interface and an implementation which forwards to the ElasticsearchClient. Your interface can act as a simplified facade over the APIs your application actually uses on the client. At this point your code can depend on your abstraction and not be coupled directly to our library. You can mock your abstraction as necessary for testing.
public interface ISearchClient // abstraction limited to the methods actually in use by dependants
{
    Task<SearchResponse<TDocument>> SearchAsync<TDocument>(Action<SearchRequestDescriptor<TDocument>> configureRequest, CancellationToken cancellationToken = default);
}

public class ElasticsearchSearchClient : ISearchClient // basic wrapper implementation
{
    public ElasticsearchSearchClient(ElasticsearchClient client) => Client = client;

    public ElasticsearchClient Client { get; }

    public Task<SearchResponse<TDocument>> SearchAsync<TDocument>(Action<SearchRequestDescriptor<TDocument>> configureRequest, CancellationToken cancellationToken = default) =>
        Client.SearchAsync(configureRequest, cancellationToken);
}
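As a follow-on sketch, the wrapper can be registered in DI and faked in tests without touching our types. The Moq usage, MyDocument, MySearchService and the stubbed response below are illustrative only and not something the client provides, and settings stands for your configured client settings:

// DI registration: the concrete client sits behind the application-owned interface.
services.AddSingleton(sp => new ElasticsearchClient(settings));
services.AddSingleton<ISearchClient, ElasticsearchSearchClient>();

// In a unit test, only the application-owned abstraction needs to be faked.
var searchClient = new Mock<ISearchClient>();
searchClient
    .Setup(c => c.SearchAsync(
        It.IsAny<Action<SearchRequestDescriptor<MyDocument>>>(),
        It.IsAny<CancellationToken>()))
    .ReturnsAsync(stubbedResponse); // stubbedResponse: a SearchResponse<MyDocument> prepared for the test

var sut = new MySearchService(searchClient.Object);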
You may even want to avoid using our request (or descriptor) and response types in your abstraction and map those in your implementation. That further decouples you from changes by encapsulating our client and your implementation can also handle more complex application-level decisions like exception handling, retries etc.
For pure unit testing, the above should avoid the need to get as low as the transport interfaces and the use of InMemoryConnection etc. In cases where you must call the client directly and have no layer of indirection in between, depending on an ElasticsearchClient with an InMemoryConnection is a viable alternative for unit testing.
For other advanced scenarios, full integration testing is a further consideration.
If there are specific examples of where unit testing is extremely difficult, even with the above approaches in mind, I'm happy to review those on a case-by-case basis. I also plan to document some example scenarios in more detail once the v8 client work is completed.
What exactly does "get misused for convenient testing" mean?
Isn't testability one of the expected elements of a library?
I'm following your advice, trying to migrate hundreds of unit tests. We have a suite of integration tests as well but they're complementary.
It seems I can't even create a proper instance of any of the responses, as IsValid is settable only through reflection/JSON deserialisation!? Yes, I could create my own responses, but then I would need to maintain mappings.
Is it how classes should be designed?
I feel like I'm wasting a lot of time on something that should be more... convenient :)
Isn't testability one of the expected elements of a library?
A library should be tested by its authors. For consuming code, yes, it should be possible to test, but I don't believe it should be the job of the library to design everything for that case. I would still advocate that consumers use their own abstractions and a thin tested integration layer to avoid depending on our types where it complicates testing.
It is possible to create request and response instances if required for tests (see below for an example), which should be sufficient for most cases. We provide the InMemoryConnection as an alternative approach to testing around responses from the client. I feel these options provide enough mechanisms for testing, but I'm happy to take examples that are difficult so we can learn as well as better document scenarios developers face.
What exactly does "get misused for convenient testing" mean?
Adding interfaces for many more types bloats the assembly, and if those interfaces only exist for convenient testing, I believe that's not a valid use. A lot of the .NET BCL doesn't provide interfaces for types. Interfaces exist when there are (or could be) many implementations of that abstraction.
In the past, I believe we have seen people stubbing or mocking our interfaces, who are then frustrated when we have to add new members. I also believe we've seen consumers with tests re-testing our implementation and not their own.
Is it how classes should be designed?
In my opinion, yes, classes should encapsulate behaviour and expose state appropriately. IsValid is a calculated value and may be overridden by response types. Responses should be read-only representations of the server response and payload. Making properties settable purely for external testing doesn't sit well with me.
I accept that having to maintain duplicates of each response and a mapping between those types and ours is probably too big an ask for most consumers. However, in an ideal world, it provides better separation from our implementation. Once completed, the mapping behaviour can be tested pretty easily using InMemoryConnection. Typically, I wouldn't expect those mappings to need to change all that often. For your specific challenge, if you want to create a response for testing, you can set its properties directly. The challenge, as you encountered, is setting the HTTP API call details. You could achieve that by casting to ITransportResponse though, e.g.
var response = new SearchResponse<Thing>
{
    HitsMetadata = new HitsMetadata<Thing>
    {
        Hits = new Hit<Thing>[]
        {
            new() { Source = new() { Id = 1 } },
            new() { Source = new() { Id = 2 } }
        }
    }
};

var r2 = response as ITransportResponse;
r2.ApiCall = new ApiCallDetails { HttpStatusCode = 200, Success = true };

if (response.IsValid)
{
    foreach (var doc in response.Documents)
    {
        Console.WriteLine(doc.Id);
    }
}
If you can share some of your test code as examples, I can certainly review that to help understand and improve any rough edges.
You've broken the hell out of our libraries and apps as well. I know the client needed an overhaul to hopefully simplify and improve things, but this is going to cause me an insane amount of work to try and migrate. I'm just trying to migrate this base library (which doesn't even do that much elasticsearch work) first as a test to see how bad it will be and literally everything is broken.
Please don't assume you can control everything and that you know what all the use case scenarios are by sealing every class and making all the properties read only.
I don't feel it should be the responsibility of the client to provide abstractions for the purpose of testing consumer code. Mocking frameworks tend to make this super convenient but it bloats the assembly and potentially catches people out if the interface has to change to match the evolving implementation
@stevejgordon it looks like you are going to catch plenty of people out by dropping the interfaces they need for testing. It's not like we're asking for much here, we're just asking you to not make these clients so difficult to test with and stand up in a DI container.
You know what happens when people can't upgrade to the latest version of your product?
This is also one of the areas where the .NET client follows Elasticsearch/Lucene conventions and not .NET conventions. Instead of trying to fix this and make the object model and function arguments saner, you make the library even more difficult for us to use.
This is an RC, so I expect things to work. I spent an hour or so just trying to convert one class which is just a tiny portion of what I would need to upgrade to the new client.
Broken:
- The MappingResponse model is empty. It does not contain any of the mapping information.
- IProperty.LocalMetadata has been removed. This was important for me and I was the one that originally requested it. Again, this is assuming that you know all of the use cases.
- IProperty.Name is gone. Not even sure how I would go about handling this without having to create my own model and populate it as I'm iterating through the mapping properties.

Annoyances:
- The Properties collection is no longer available from a common base (it used to be on CoreProperty). So now I will need to know every single property type and handle each case to return the Properties collection on property types that support it.
- NumberProperty is gone, which was a common base type for all numeric property types.

I'm done attempting for now. This is going to be a nightmare.
This is just a tiny sample of how you've not made our lives as consumers of this library simpler or better, but much harder, because of your perceived gains in protecting yourself from support issues. Your massive breaking changes and barriers are costing people who have invested heavily in this library an incredible amount of time, effort and pain if we don't want to be stuck using the legacy client forever.
I know you wanted to modernize this library and have all of the models and methods code generated, but IMO, you've taken WAY too much liberty in breaking everything and I don't think you realized the cost you would be inflicting on your users.
@Mpdreamz @stevejgordon It would be great if you could help try to migrate Foundatio.Parsers to see the pain yourself before releasing this. This is a long standing community project and probably the easiest thing to convert.
Hi all. We have read and are currently digesting your feedback in this thread. I will put together a response over the next few days to provide answers and clarifications.
Firstly, I'd like to thank all those who have provided constructive feedback in this thread. It is important for us to be able to understand what our users can and cannot do with the tools we provide. We have also spent some time looking into the projects associated with the contributors to this thread, which has given us further insight.
But please bear in mind that this is a team effort on our side. As such, I would ask that you avoid targeting a particular comment or complaint at an individual engineer, unless you are already involved in a direct discussion with that person.
A big part of our work here is to make conscious decisions on what we do and what we don't do. There will always be use cases that we aren't able to fulfil, and users who disagree with our design choices. This is an unfortunate reality of all software development and is a factor in all decisions we make; and we take those decisions seriously. Our role is therefore to try to achieve a balance that works for the greatest number of users, taking into account as many factors as possible, and choosing trade-offs where necessary.
The first important point to clarify is that version 8 of the client represents a new start. Mature code becomes increasingly hard to maintain over time, and our ability to make timely releases has diminished as code complexity has increased. Major releases give us an opportunity to simplify, as well as to better align our language clients with each other in terms of design. Here, it is crucial to find the right balance between uniformity across programming languages and the idiomatic concerns of each language. For .NET, we will typically compare and contrast with Java and Go to make sure that our approach is equivalent for each of these. We also take heavy inspiration from Microsoft, as well as the conventions of the wider .NET community.
We have intentionally shipped the new code-generated client as a new package with a new root namespace. This new client is built upon the foundations of NEST, but there are changes. By shipping as a new package, the expectation is that migration can be managed in a phased approach.
We have also intentionally limited the feature set for the initial 8.0 release. Over the course of the 8.x releases, we will be able to reintroduce features again. Therefore, if something is missing right now, it may not be omitted permanently. We will endeavour to communicate our plans in this regard in the final release notes, as and when they become available.
Given the size of the Elasticsearch API surface today, it is no longer practical to maintain by hand the thousands of data types this involves. To ensure consistent, accurate and timely alignment between language clients and Elasticsearch, the 8.x clients and all the associated types are now code-generated from a common specification. This is a common solution to maintaining alignment between client and server among SDKs and libraries, such as those for Azure, AWS and the Google Cloud Platform.
Code generation from a specification has inevitably led to some differences between the existing NEST types and those available in the new client. One example is the lack of the convenience member Name on Property types. For the 8.0 release we generate strictly from the specification, special-casing a few areas to improve usability or to align with language idioms. The base type hierarchy for concepts such as Properties, Aggregations and Queries is no longer present in generated code, as these arbitrary groupings do not align with concrete concepts of the public server API. These considerations do not preclude adding syntactic sugar and usability enhancements to types in future releases on a case-by-case basis.
Our user base is vast and diverse. We cannot obviously cater for everyone, nor anticipate all use cases, but we can try to provide tooling and recommendations to as many as possible. The primary intended audience for the client library is application developers, building software within a particular business domain. As such, most of our design choices and optimisations for this library have those users in mind. The lower level transport functionality was split out as a separate package. This is not intended for direct use by application developers, but may be useful for more advanced use cases, such as when building other developer tooling.
With the primary use case in mind, our main concern is to construct a surface that offers functionality to those people. Most of our user base don't care how the classes that make up that surface are implemented; the result for them is exactly the same regardless.
We do not intend to bring back interfaces for the majority of classes, as used by some for mocking. While we do appreciate that this will result in extra work for some people, this is not something we provide in Java either. This trade-off helps us to avoid bloating the assembly, which in turn lets us make releases in a more timely fashion.
We include mechanisms for testing application code as part of the library. The transport layer includes an InMemoryConnection. We recommend that tests use that connection to "mock" the JSON responses rather than attempt to mock client library types. This ensures that all behaviour of the client is maintained and can be relied upon to be accurate to the actual implementation during testing. This approach is similar to the recommendations for testing Microsoft Entity Framework Core and other database drivers. For more thorough integration testing, consider using the Elastic.Elasticsearch.Xunit and abstraction libraries to spin up a cluster as part of your tests.
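As a rough sketch of that approach, shown here with the NEST 7.x / Elasticsearch.Net type names (the equivalent types in the new transport package are named slightly differently), a test can stub the JSON that the client will deserialise:

using System;
using System.Text;
using Elasticsearch.Net;
using Nest;

// Canned JSON the "server" returns for every request in this test.
var json = @"{ ""took"": 1, ""timed_out"": false, ""hits"": { ""total"": { ""value"": 0, ""relation"": ""eq"" }, ""hits"": [] } }";
var connection = new InMemoryConnection(Encoding.UTF8.GetBytes(json), 200);

var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var settings = new ConnectionSettings(pool, connection).DefaultIndex("my-index");
var client = new ElasticClient(settings);

// The full client pipeline and deserialisation run as normal, against the canned payload.
var response = client.Search<object>(s => s.Query(q => q.MatchAll()));
Console.WriteLine(response.IsValid);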
Users who prefer to mock types will need to define their own abstraction(s) with an implementation that wraps the ElasticsearchClient. Consumer abstractions can then be used in DI and testing scenarios with reduced coupling to our types. It is possible to create request and response types to facilitate testing. These types encapsulate the data and behaviour they represent (responses are read-only) and are not explicitly designed for testing, although, as demonstrated above, this is achievable.
For more complete decoupling, we would recommend that consumers define their own types to model the request/response data used in application code. A small, easily-tested mapping layer can then isolate changes to client library types into a small integration layer. Such types can be designed specifically for application requirements with test simplification a part of that design choice if required.
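For illustration, a sketch of such a mapping layer; ProductSummary, Product and the mapping method below are application-owned, invented types rather than anything shipped with the client:

using System.Collections.Generic;
using System.Linq;

// Application-owned model exposing only the fields the application actually needs.
public record ProductSummary(string Id, string Name);

public static class ProductMapping
{
    // The only place in the application that touches the client's response types.
    public static IReadOnlyList<ProductSummary> ToSummaries(SearchResponse<Product> response) =>
        response.Documents
            .Select(p => new ProductSummary(p.Id, p.Name))
            .ToList();
}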
We plan to formalise this guidance and include examples as part of our renewed documentation after GA of the client.
Opinions on "sealing by default" within the .NET ecosystem tend to be quite polarised. While the Microsoft framework design guidelines do not recommend sealing all types by default, doing so seems to be a trend for newer types, based on recent API reviews. An example can be seen in System.Threading.RateLimiting
. Microsoft seal all internal types for potential performance gains and we see benefit in starting with that approach for the Elasticsearch client, even for our public API surface. Our thoughts here align with Jon Skeet who is well-regarded in the C# world and works at Google on the SDK libraries, which seal most classes by default. We have followed the same design philosophy.
While it prevents inheritance and therefore may inhibit a few consumer scenarios, the intent of sealing by default is to avoid the unexpected or invalid extension of types which could inadvertently be broken in the future. That said, sealing is not necessarily a final choice for all types; but it is clearly easier for a future release to unseal a previously-sealed class than vice versa. We can therefore choose to unseal as valid scenarios arise, should we determine that doing so is the best solution for those scenarios. This goes back to our clean-slate concept for this new client.
The 8.0 release does not have feature parity with NEST and focuses on core endpoints more specifically for common CRUD scenarios. Our intent is to reduce the feature gap in subsequent versions. We anticipate that this initial release will best suit new applications and may not be migration-ready for all existing consumers.
We also fully understand that the choice to code-generate a new evolution of the .NET client introduces some significant breaking changes. We consciously took the opportunity to refactor and reconsider historic design choices as part of this release, with a view to limiting further breaking changes going forward.
The 8.0 client is shipped as a new NuGet package which can be installed alongside NEST. For complex migrations, we would anticipate that some consumers may prefer a phased migration with both packages side-by-side. In addition, NEST 7.17.x can continue to be used in compatibility mode with Elasticsearch 8.x until the 8.0 client features align with application requirements and even during the migration phase. We will include detailed release notes explaining the missing features, endpoints, aggregations and queries to guide consumer readiness for migration.
We will continue to prioritise the feature roadmap and code-generation work based on feedback from consumers who may rely on features that are initially unavailable.
The feedback in this issue and our review of users' project code has allowed us to identify several bugs and usability issues which will be reviewed and fixed for the second release candidate. Combined with our own items identified during RC1 testing, a list of primary changes we therefore plan to make in RC2 is as follows:
- QueryContainer to simplify assignment.
- The SortCombination union is hard to work with. We will simplify the specification in this instance to remove the union and only expose SortOptions, to align with the approach in the Java client.
- TermsExclude, which is generated based on the specification but is more difficult to use than the existing NEST type.
- string[] to IEnumerable<Field>.
- Properties implementation of IDictionary to support TryGetValue etc.
- Refresh method on the client accepting just the index name. NOTE: There will be other missing shortcut overloads for other endpoints as well that will be added as they are identified.
- TermsInclude type.
- Properties.Add method key is generated as a string rather than PropertyName. Code-generation will be reviewed and fixed.
- dictionary_of is generated as an empty class (MappingResponse is an example), which is unusable. Ensure these generate an appropriate type.
- QueryStringQuery.Fields is generated as IEnumerable<Field> but should be Fields.

This response hopefully offers some clarification on the use cases we are primarily aiming to serve with this library, as well as some general advice for others. Thank you again for your constructive feedback - it definitely helps us to move in the right direction. Since roadmapping is now complete for 8.0, we will be closing this issue. Feedback and problems found within RC2 and beyond should be opened in dedicated issues so they can be prioritised and addressed individually.
This trade-off helps us to avoid bloating the assembly, which in turn lets us make releases in a more timely fashion.
A few bytes is not bloat, and you are code-generating most of this, so it ain't really costin' you nothin' in maintenance to help us out a bit here with a few simple abstractions. It's not like we are asking for something outrageous, just that pragmatism and the real value that comes with it doesn't lose out to the imagined value of the ideas of perfectionism or purism. This library has always been a pain to use for a .NET developer, and you just make it that much harder for us to use. It's already at the point where I need to use the LowLevel client just to get things to work (they are the only things that match any documentation). It sounds like 8.x is when I will just have to start hand-writing the API calls.
While the Microsoft framework design guidelines do not recommend sealing all types by default, doing so seems to be a trend for newer types, based on recent API reviews.
You trust something you are interpreting to be a trend more than established and evidenced advice?
Thank you again for your constructive feedback - it definitely helps us to move in the right direction
Which direction is that, exactly? It doesn't seem to be the one that costs you next to nothing and saves your users a lot of time and money.
TBH, it really sounds to me like you need a Red Team.
EDIT: @technige @stevejgordon - Also of note, Jon Skeet (whom you cite above) also wishes for interfaces to use for testing (from https://stackoverflow.com/a/6389669/16391)
v8.0.0 .NET Client Roadmap
On August 10th, we released Elasticsearch 8.0.0-alpha1, and we wanted to provide this high-level roadmap of what we are planning for the v8.0.0 client, laying the groundwork for its future. We’ve been working towards the next client release for several months, with some elements well underway. We’re excited to begin sharing our vision in this issue. A major release is an excellent opportunity to reflect on any existing design limitations and introduce improvements that may be difficult or impossible without breaking changes. This document highlights important details about how we plan to maintain the client in the future. These drive some of our decisions for this release.
NOTE: This roadmap outlines ideas, concepts and features that we hope to include in the v8.0.0 client release. Some items may change as we investigate them, and/or may move to future releases as appropriate.
Themes
Let’s begin with some of the main themes we’ve identified as key objectives for this release.
User-friendly - The client should be approachable for consumers of all experience levels. The API surface should be reviewed to identify areas for improvement, particularly for common scenarios. Public types and their members should include helpful XML comments which appear in IDE tooling to guide their use. Additional helpers should be introduced, for example, helpers for using Point In Time for optimised and simplified data egress.
Performance - .NET Core and .NET 5+ introduced huge performance improvements within the runtime and Base Class Libraries (BCL). This includes methods and types geared towards reducing allocations, such as Span<T>. The client should leverage these to reduce allocations on hot paths. The client should introduce overloads accepting these types where it can further offer a benefit for improved performance or convenience. Development of the client should continuously seek further performance improvements which consumers can benefit from, simply by upgrading to the latest version.
Best Practices - The client should continue to apply Microsoft best practices around API design. This includes ensuring that its design aligns well with the latest API design standards used by Microsoft themselves. The client should also guide consumers to leverage the latest best practices in Elasticsearch by preferring more optimal APIs where applicable. For example, favouring Point In Time APIs over the deprecated Scroll APIs.
Efficient to maintain - This entry may be surprising as it appears less user-focused at first glance, but bear with us. Elasticsearch introduces many great new features in each minor release. These features add new endpoints and expand requests and responses for existing APIs. For the low-level client, we automatically generate code to support these APIs on day one. For NEST, the high-level client, we must manually maintain many types to implement the strongly-typed support. This requires significant engineering time. With the next release, we are aiming to reduce this overhead (see below “Code Generation” section for further details). Removing this overhead creates more time to work on value-add features such as helpers, performance, and documentation.
Diagnostics - The client should make diagnosing issues as simple as possible. We already have excellent diagnostics in the form of audit trails, debug information and DiagnosticSource events. These can be configured to understand the causes of any problems. The next client version should build on this foundation to further improve the diagnostics story.
Documentation - The documentation should be clear and guide consumers toward the path of success when using the client. It should include more detail for common scenarios and include recommendations for best practice usage. Where we have frequently asked questions, the documentation should be expanded to address those more clearly. We want consumers of the client library and Elasticsearch to be productive with as little friction as possible.
Primary Changes
Before we get to the roadmap items, it’s worth calling out a few core changes we are planning, which influence design decisions and some of the items which appear on the roadmap.
Code Generation
As indicated in the "Efficient to maintain" theme above, manually maintaining request/response types requires a lot of engineering time. It can also introduce a lag in the implementation of some APIs in the high-level NEST client. It's a predominantly manual process, and as such, things can be missed.
The Elastic language clients team are excited to be working on a type specification internally, which will provide a fantastic resource to document the endpoints of Elasticsearch. This includes defining representations of the requests and responses, and all subtypes needed to (de)serialise API request and response bodies. We have a fantastic opportunity to leverage this specification by building advanced code-generators for our clients.
This work is already underway for .NET. We have a generator prototype that uses the Roslyn APIs to produce far more of the code required within the high-level client. We intend to continue with this work to code generate all Elasticsearch endpoints and their corresponding types within the client. Once this work is complete, new server features will be added via automated PRs using GitHub Actions. This is extremely exciting as it ensures timely inclusion and support of new endpoints in Elasticsearch. Code generation also helps ensure all fields on requests and responses are supported and represented accurately. Automation for the win!
A side-effect of code generation is that it may require some type names and namespaces to change. When manually crafting the types, engineers have carefully avoided naming conflicts. The code generator needs to be more generic in its approach and will leverage namespaces to distinguish types from one another. The intent is to try to limit the breaking changes this introduces. As we progress with code generator work, we will have a complete understanding of what this may involve for the consumption of the library and the upgrade process.
System.Text.Json
Currently, the high-level client uses an internalised and modified version of Utf8Json for request and response (de)serialisation. This was introduced for its performance improvements over Json.NET, the more common JSON framework at the time.
While Utf8Json provides good value, we have identified minor bugs and performance issues that have required maintenance over time. Some of these are hard to change without more significant effort. This library is no longer maintained, and any such changes cannot easily be contributed back to the original project.
With .NET Core 3.0, Microsoft shipped new JSON APIs that are part of .NET. Initially, the feature set was quite limited, but each subsequent release of .NET has filled more of the functionality gaps. For v8.0.0, we plan to adopt the System.Text.Json (STJ) APIs for all (de)serialisation. Consumers will still be able to plug in their own serialisation for their document types.
By adopting a Microsoft-supported library, we can better depend on and contribute to its maintenance. STJ is designed from the ground up to support the latest performance optimisations in .NET and, as a result, offers fast, low-allocation (de)serialisation. Further work is included in .NET 6, which will continue to optimise serialisation through source generators, which we can leverage to gain further performance boosts in our .NET client.
This is a significant piece of work as we require many custom converters for more complex types and JSON structures. Requests and responses, for example, search, include polymorphic properties. We are prototyping these changes in the code generated client with good success so far.
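To give a flavour of what such converters involve, here is a simplified sketch only, not the client's actual implementation; the Shape hierarchy and the "type" discriminator are invented. A hand-written System.Text.Json converter for a polymorphic property looks roughly like this:

using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public abstract class Shape { }
public sealed class Circle : Shape { public double Radius { get; set; } }
public sealed class Square : Shape { public double Side { get; set; } }

public sealed class ShapeConverter : JsonConverter<Shape>
{
    public override Shape Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        using var document = JsonDocument.ParseValue(ref reader);
        var root = document.RootElement;

        // Dispatch on a discriminator property to pick the concrete type.
        var discriminator = root.GetProperty("type").GetString();
        var json = root.GetRawText();

        return discriminator switch
        {
            "circle" => JsonSerializer.Deserialize<Circle>(json, options),
            "square" => JsonSerializer.Deserialize<Square>(json, options),
            _ => throw new JsonException($"Unknown shape type '{discriminator}'.")
        };
    }

    // Serialise using the runtime type so derived properties are written.
    public override void Write(Utf8JsonWriter writer, Shape value, JsonSerializerOptions options) =>
        JsonSerializer.Serialize(writer, value, value.GetType(), options);
}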
Transport
The .NET client includes a transport layer responsible for abstracting HTTP concepts and providing functionality such as our request pipeline. This supports round-robin load-balancing of requests to nodes, pinging failed nodes and sniffing the cluster for node roles.
As part of v8.0.0, we are moving this transport layer out into its own dedicated package and repository. This supports reuse across future clients and allows consumers with extreme high-performance requirements to build upon this foundation. We already have the master branch of the existing client repository migrated to this new Transport package.
Before release, we are investigating further enhancements to support other scenarios and optimise performance. We also plan to ensure that we can implement future HTTP improvements from Microsoft, including a proposed set of lower-level APIs (LLHTTP) for further allocation reductions.
High-level Roadmap
Below you will find some of the core units of work we are undertaking for the next client version. These are roughly broken into stages representing the priorities and dependencies of these items.
Stage 1
- IAsyncDisposable support where applicable.

Stage 2
- IAsyncEnumerable support on appropriate APIs and helpers.
- ValueTask return types where appropriate for performance gains.

Stage 3

Stage 4
Summary
We’re incredibly excited about the work we have begun towards the next version of the .NET client for Elasticsearch. We have a lot of work ahead and will share more as we are nearer a final product. We welcome your feedback and ideas which can help shape the future of the .NET client.