elastic / elasticsearch-net

This strongly-typed, client library enables working with Elasticsearch. It is the official client maintained and supported by Elastic.
https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/index.html
Apache License 2.0
3.54k stars 1.14k forks source link

The API doesn't type #4196

Closed haf-afa closed 3 years ago

haf-afa commented 4 years ago
  1. The API doesn't type, I have to use op_Implicit in a bunch of places when I declare index names.
  2. The API doesn't type, all the primary functions to actually index stuff take a Func<SomethingDescriptor<'TDocument>, ISomethingRequest> but the builder doesn't have a Finish() or Build() method that converts the descriptor to an actual request. When searching types in this repository there's a bunch of empty stubs, such as IIndexRequest and IndexRequest but with no implementations and no docs and there are no current docs on the homepage nor in the readme and all examples use the broken invocation. I presume there's a C# implicit conversion somewhere from the descriptor to the request interface, but how to call it is completely undocumented. :(
  3. When adding first a property, app and then app.kubernetes.io, I would expect them to map to two mappings for two properties, but ES interprets it as one and the same, erroring out with: "org.elasticsearch.index.mapper.MapperParsingException: Could not dynamically add mapping for field [app.kubernetes.io/instance]. Existing mapping for [labels.app] must be of type object but found [text]." (labels is the parent object) This could be type-safely done, perhaps? Because if the call types, it should work.

Example for 2 that I haven't figured out yet:

  let createBuilder (x: CreateIndexDescriptor): ICreateIndexRequest =
    x.Map<CRD>(fun x -> x.AutoMap().Properties(fun y -> y) :> IPromise<_>) :> ICreateIndexRequest
// doesn't work

Workaround for 2:

let indexDocument (client: ElasticClient) (doc: CRD) =
  client.IndexAsync<CRD>(doc, fun i -> i.Id(doc.getId() |> Id.op_Implicit) :> IIndexRequest<_>)
    |> Async.AwaitTask

Actionable:

russcam commented 4 years ago
  1. There are implicit conversions from a set of types to a type used by the client. For example,

    https://github.com/elastic/elasticsearch-net/blob/901e60d520ccf938025485555bbc263e9734bf40/src/Nest/CommonAbstractions/Infer/IndexName/IndexName.cs#L81-L83

    IndexName has implicit conversions from string and Type. Using these from F# would currently require using op_Implicit.

  2. All SomethingDescriptor<'TDocument> explicitly implement their counterpart interface, e.g. ISomethingRequest, so there is no terminal build method. Because of this, using the fluent API with F# is going to be somewhat verbose due to needing to explicitly cast to the interface when finishing construction, all the way down the fluent building graph.

    such as IIndexRequest and IndexRequest but with no implementations and no docs and there are no current docs on the homepage nor in the readme and all examples use the broken invocation

    There is a fairly comprehensive suite of integration tests that run on every commit, and examples are also generated from tests, so I don't agree with your assessment here.

    You may want to consider using the object initializer syntax with F#, as I think it may be less onerous to use. There's a recent blog post and accompanying gist that may help.

  3. This is Elasticsearch behaviour: app.kubernetes.io is interpreted as the path within an object hierarchy, equivalent to if you'd sent:

    "app": {
        "kubernetes": {
            "io": "some value"
        }    
    }

    There's some history to this (e.g. https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html and https://github.com/elastic/elasticsearch/pull/19937), but this is the behaviour. Elastic Common Schema may be of interest to you: https://www.elastic.co/guide/en/ecs/current/ecs-reference.html.

Create a few F# examples to ensure it's not so utterly frustrating to start using in a language that is not C#

We have discussed several times in the past about better supporting F#, either through some wrapper APIs, similar to what FSharp.Akka does for Akka, or through a type provider. It is low priority for us however. Having written many client examples with F# in the past, I would agree that the API is more awkward to use with F# than C#, less so with the object initializer syntax though.

Instead of overloading functions, and especially so if you have generic parameters in them, create two functions with clear and distinct names reflecting their usage, so I can choose whether to use a builder or construct the XXXRequest object outside of the call to the function.

I'm not sure I understand this point. You can already construct requests outside of the call to a function, using a descriptor (builder) or an XXXRequest. The method overloading approach supporting the two different syntaxes generally works well, where the method names map to the Elasticsearch APIs, over 270 of them, and counting.

haf-afa commented 4 years ago

Using these from F# would currently require using op_Implicit.

You don't have to do it that way, just provide an overload and then F# deals with it just fine. The way it is right now doesn't type when you follow the docs. Using implicits instead of overloads adds no extra value.

Elastic Common Schema may be of interest to you: https://www.elastic.co/guide/en/ecs/current/ecs-reference.html.

Nice resource, thanks!

We have discussed several times in the past about better supporting F#, either through some wrapper APIs, similar to what FSharp.Akka does for Akka, or through a type provider.

I think you're over-complicating it then; F# interops very well with C# if the C# is written in a pretty standard way.

Since the client statically depends on the non-interface type, why not just let the callback return that? It's not like callers of your fluent API are going to say "Hey, I'm going to do a Liskov here and replace the instance I got in the callback with my own implementation of ITypeMapping", unless I'm not understanding something?

Doing a type provider for this is a complete drain of resources, I really don't recommend it. You basically want to set properties and do some reflection; improving your API to solve that is easy and doesn't have to take long. Plus, all a type provider would do is to generate the same types whose abstractness we're discussing and the remainder; the client/builder/reflection bits would still have to be redesigned to match more general .Net idioms. The only difference if you go down this route is that you're not working with a compiler any more, but instead generating your code via a reflection-like API -- it's just so much easier to fix the minor niggles of this API than to do a TP.

with the object initializer syntax though.

The problem here is finding the right magic types to use when one doesn't know the API before. The docs don't document how to use the API sufficiently well. You can solve this either by

The method overloading approach supporting the two different syntaxes generally works well, where the method names map to the Elasticsearch APIs, over 270 of them, and counting.

Are you saying you have an API in Elasticsearch's API suffixed with Async? Because if you don't, then you already have renamed methods and my point stands w.r.t this particular reply.

I'm not sure I understand this point. You can already construct requests outside of the call to a function, using a descriptor (builder) or an XXXRequest.

What I mean is that when the programmer uses your API definitions in an IDE, there are two things that confuse the typing in F# in this case;

  1. The constant concrete -> abstract upcasts (and somehow (magically?) knowing that the type you got from the builder factory explicitly implements the interface!)
  2. That the method is overloaded - it makes the error messages much worse and it's harder to derive whether I missed some parameter or if the types of the callback don't type; because if those callback types don't type (which is almost always the case when using your API), the compiler responds with non-hydrated generic parameters in the error messages, like so:

bad-api-experience

And marks (as you can see) everything an error, when the real error is in fun y -> y in this case (amongst things). Also, IPromise<'a>? Is that a future library I spy?

Generally though; why are you making everything interfaces? If you have a request type that is a data bearer I really don't need an interface. Interfaces are called "protocols" in other languages and should define a method by which you can interact with stateful objects; or if you have data bearers, interfaces are nice to use as a tool to compose many aspects together, but having a one-to-one mapping between interface and implementation types is just bike shedding to appease some mocking framework most of the time.

russcam commented 4 years ago

You don't have to do it that way, just provide an overload and then F# deals with it just fine.

Are you proposing that .Field(...) is overloaded to accept string, everywhere that it is used?

The way it is right now doesn't type when you follow the docs. Using implicits instead of overloads adds no extra value.

This may be true for F#, but is not true for C#, the language the docs are written in. Using implicit conversions does add value in that it allows a string to be passed anywhere a Field is accepted, without needing to define overloads to also accept string.

Since the client statically depends on the non-interface type, why not just let the callback return that? It's not like callers of your fluent API are going to say "Hey, I'm going to do a Liskov here and replace the instance I got in the callback with my own implementation of ITypeMapping", unless I'm not understanding something?

There may be a reason why the signatures are Func<*Descriptor, I*Request> rather than Func<*Descriptor, *Descriptor>, which would require some investigation. I'll add this as a topic to discuss for the next major version.

Generally though; why are you making everything interfaces? If you have a request type that is a data bearer I really don't need an interface. Interfaces are called "protocols" in other languages and should define a method by which you can interact with stateful objects; or if you have data bearers, interfaces are nice to use as a tool to compose many aspects together, but having a one-to-one mapping between interface and implementation types is just bike shedding to appease some mocking framework most of the time.

There are two implementations for each I*Request interface:

  1. *Descriptor that explicitly implements the interface, providing a fluent builder pattern to build a request, which is referred to as the fluent syntax/API. C# and VB.NET will implicitly cast the *Descriptor to the interface, aligning with the Func<*Descriptor, I*Request> signature.
  2. *Request that implicitly implements the interface, allowing a request to be composed by building up an object graph, referred to as the object initializer syntax/API.

Are you saying you have an API in Elasticsearch's API suffixed with Async? Because if you don't, then you already have renamed methods and my point stands w.r.t this particular reply.

I think this is muddying the waters of the point. Since there is no method overloading varying on return type, async methods by nature will be differently named to their sync counterparts. The general practice when both sync and async methods are provided is also to suffix async methods with Async, which I'm sure you know 🙂

The constant concrete -> abstract upcasts (and somehow (magically?) knowing that the type you got from the builder factory explicitly implements the interface!)

The vast majority of users, using C# and VB.NET, don't need to know this. I would personally prefer that F# would also implicitly handle this, specifically for interop scenarios like this.

And marks (as you can see) everything an error, when the real error is in fun y -> y in this case (amongst things). Also, IPromise<'a>? Is that a future library I spy?

This is something that should be addressed for the next major version, as the implicit casting in some cases where IPromise<T> is used is problematic with F#. The following works

let createIndexResponse = 
    client.Indices.Create(index "kubernetes-resources", fun c -> 
        c.Map<Doc>(fun m ->
            m.AutoMap()
             .Properties(fun y -> y :> IPromise<_>) 
            :> ITypeMapping) 
        :> ICreateIndexRequest)

IPromise<T> is used in cases where repeated calls to some method need to be collected to return a a final value, IIRC.

As mentioned above, where we can improve the API to better interoperate with F# is something we will discuss for the next major version.

haf-afa commented 4 years ago

Using implicit conversions does add value in that it allows a string to be passed anywhere a Field is accepted, without needing to define overloads to also accept string.

Right, but it's auto-generated code, isn't it?

There are two implementations for each I*Request interface:

What I have a hard time understanding, (which you said you'd investigate) is how writing properties to the explicit interface implementation "knows" when it should complete the Request value, and therefore how that is different from writing to the object graph API directly. "Originally" the pattern of having a callback called, would be to keep the pre- and post- logic the same but varying the stuff in the middle, but if the Descriptors can be upcast then that abstraction might be broken.

I think this is muddying the waters of the point. [About Async suffix]

Well, in F# you don't have exactly the same overload resolution, so having a separate method would help, just like the Async suffix is a pattern in C#. All I'm saying is that if this codebase is auto-generated, accordances can be made without causing any large pain to the other language users. Just like Async is an affordance.

What your original point is, if I'm trying to articulate it as I understand it, is; [we want a one-to-one mapping between REST API endpoints and C# methods, because this is "more beautiful" or "more pure" or "easier to manage" than having more than one] -- is this what your intent?

F# is something we will discuss for the next major version.

Great, and perhaps you can consider some of the quickfixes, too, or else document the object API?

russcam commented 4 years ago

Right, but it's auto-generated code, isn't it?

Sadly, the request and response types are not generated; they are currently all hand written. There have been many discussions over the years to build a complete API specification for all requests and responses, which would make generating them possible, but we are not there yet.

What I have a hard time understanding, (which you said you'd investigate) is how writing properties to the explicit interface implementation "knows" when it should complete the Request value, and therefore how that is different from writing to the object graph API directly. "Originally" the pattern of having a callback called, would be to keep the pre- and post- logic the same but varying the stuff in the middle, but if the Descriptors can be upcast then that abstraction might be broken.

It doesn't need to know as such, because at any point, a **Descriptor is a valid I**Request, because it explicitly implements the interface. Therefore, it doesn't need to build some other instance with an explicit call to a terminal build() method. The explicit interface implementation allows the *Descriptor to have fluent method names that match the names of the explicitly implemented interface properties. As a consumer, you are free to choose whether to use the fluent syntax or object initializer syntax to construct requests.

What your original point is, if I'm trying to articulate it as I understand it, is; [we want a one-to-one mapping between REST API endpoints and C# methods, because this is "more beautiful" or "more pure" or "easier to manage" than having more than one] -- is this what your intent?

We want a one-to-one mapping (+ async variants) between client methods and APIs because we believe this makes it easier for client consumers to use the API. For example, a consumer can use the fluent API to make a synchronous search call

var client = new ElasticClient();

var searchResponse = client.Search<Document>(s => s
    .Index("posts")
    .Query(q => q
        .Match(m => m
            .Field(f => f.Content)
            .Query("some query")
        )
    )
);

or the object initializer syntax

var client = new ElasticClient();

var searchResponse = client.Search<Document>(new SearchRequest<Document>("posts")
{
    Query = new MatchQuery 
    {
        Field = Nest.Infer.Field<Document>(f => f.Content),
        Query = "some query"
    }
});

In both cases, Search<TDocument>(...) aligns with the search API.

Great, and perhaps you can consider some of the quickfixes, too, or else document the object API?

As a rule, we follow semantic versioning, and align at least the major version of the client with the major version of Elasticsearch. Accordingly, binary breaking API changes would only be able to be made in a major release. What do you see as the quickfixes?

haf-afa commented 4 years ago

What do you see as the quickfixes?

See referenced PR for an example

stevejgordon commented 3 years ago

Will open meta-ticket to track improved F# support with v8+