jgaskins / elasticsearch

Elasticsearch client for Crystal
https://jgaskins.dev/elasticsearch/
MIT License
4 stars 1 forks source link

Documentation improvements #1

Open descholar-ceo opened 1 year ago

descholar-ceo commented 1 year ago

I have been struggling to understand a way to use this package, even though it looks great, I tried to delve into it, but it feels like it's very hard to know how to without documentation, I can help write some of them, but first, how can I master the way the package itself works?

Just giving a straight example, I missed a correct way to pass this query to the search method:

{
  "query": {
    "bool": {
      "must": [
        {"wildcard": { "first_name": "*em*"}},
        {"match": { "is_active": true }},
        {"range": {
          "created_at": {
            "gte": "2023-09-14T00:00:00",
            "lte": "2023-10-14T23:59:59"
          }
        }
        }
      ]
    }
  }
}

I tried something like this:

found_matches = @@es.search("users", query: {
        bool: {
          must: [
            {wildcard: {first_name: "*em*"}},
            {match: {is_active: true}},
            {range: {
              created_at: {
                gte: "2023-09-14T00:00:00",
                lte: "2023-10-14T23:59:59",
              },
            }},
          ],
        },
      })

But the compiler was not happy with this, it threw this message:

Error: no overload matches 'Elasticsearch::Client#search' with types String, query: NamedTuple(bool: NamedTuple(must: Array(NamedTuple(match: NamedTuple(is_active: Bool)) | NamedTuple(range: NamedTuple(created_at: NamedTuple(gte: String, lte: String))) | NamedTuple(wildcard: NamedTuple(first_name: String)))))

Overloads are:
    - Elasticsearch::Client#search(index_name : String | Enumerable(String), query, *, as type : T.class, from : Int | ::Nil = nil, fields : Array(String) | ::Nil = nil, size : Int | ::Nil = nil, source : String | Bool | Nil = nil, aggregations = nil, sort = nil, track_scores = nil, profile = nil) forall T
    - Elasticsearch::Client#search(index_name : String | Enumerable(String), *, match_all, fields = nil, from = nil, sort = nil, size = nil, aggregations = nil, profile = nil, as type : T.class = JSON::Any) forall T
    - Elasticsearch::Client#search(index_name : String | Enumerable(String), *, simple_query_string query : String, default_operator = nil, analyzer = nil, fields : Array(String) | ::Nil = nil, aggregations = nil, from = nil, profile = nil, size = nil, sort = nil, source = nil, as type : T.class = JSON::Any) forall T
    - Elasticsearch::Client#search(index_name : String | Enumerable(String), *, query_string query : String, query_string_options = NamedTuple.new, aggregations = nil, fields = nil, from = nil, profile = nil, size = nil, sort = nil, source = nil, as type : T.class = JSON::Any) forall T

I am wondering what's the ideal type of the query that the search method expects?

jgaskins commented 1 year ago

I think you're missing the as keyword argument so the client knows how to deserialize your results. Some of the overloads default to deserializing the result source values as JSON::Any, but it looks like it's missing from the one you're using. Does this work?

found_matches = @@es.search("users", query: your_query, as: JSON::Any)

If you have some JSON::Serializable types, you can use them in place of JSON::Any.

The query object itself is untyped until I can figure out a decent data structure for them to make them type-safe — ideally, since ES queries can be extremely complex, we would be able to provide that feedback at compile time, but for right now I believe you can pass in any object that responds to to_json and it will compile. You'll just get query errors at runtime until we have the right data structures in place.

jgaskins commented 1 year ago

it's very hard to know how to without documentation, I can help write some of them, but first, how can I master the way the package itself works?

I haven't written as much documentation as I'd like, largely because I hadn't expected anyone else to use this shard yet. 😄 The docs I do have are available here.

If you'd like to contribute more documentation, the TL;DR, is that I am trying to keep the APIs to match the naming the docs use, so for example, the documents index API is called as es.docs.index (only abbreviated because documents takes me too long to type). The return type of es.docs is Elasticsearch::Documents.

descholar-ceo commented 1 year ago

I think you're missing the as keyword argument so the client knows how to deserialize your results. Some of the overloads default to deserializing the result source values as JSON::Any, but it looks like it's missing from the one you're using. Does this work?

found_matches = @@es.search("users", query: your_query, as: JSON::Any)

If you have some JSON::Serializable types, you can use them in place of JSON::Any.

The query object itself is untyped until I can figure out a decent data structure for them to make them type-safe — ideally, since ES queries can be extremely complex, we would be able to provide that feedback at compile time, but for right now I believe you can pass in any object that responds to to_json and it will compile. You'll just get query errors at runtime until we have the right data structures in place.

Adding the as_type worked, however, what about passing sorting? I tried passing it like this: sort: [{ created_at: { order: 'desc' } }], and this could not work as well.

Another difficult time I had was to be able to customize the elasticsearch response itself for example excluding some fields to be returned from the response, I needed something like this _source: { excludes: ['tags', '@version', '@timestamp'] },

jgaskins commented 1 year ago

Adding the as_type worked, however, what about passing sorting? I tried passing it like this: sort: [{ created_at: { order: 'desc' } }], and this could not work as well.

You can pass an ES::Sort as the sort argument. IIRC Elasticsearch doesn’t accept arrays there.

es.search index, query,
  sort: ES::Sort{"created_at" => "desc"}

Another difficult time I had was to be able to customize the elasticsearch response itself for example excluding some fields to be returned from the response, I needed something like this _source: { excludes: ['tags', '@version', '@timestamp'] },

I don’t think that’s implemented yet.

descholar-ceo commented 1 year ago

Adding the as_type worked, however, what about passing sorting? I tried passing it like this: sort: [{ created_at: { order: 'desc' } }], and this could not work as well.

You can pass an ES::Sort as the sort argument. IIRC Elasticsearch doesn’t accept arrays there.

es.search index, query,
  sort: ES::Sort{"created_at" => "desc"}

Another difficult time I had was to be able to customize the elasticsearch response itself for example excluding some fields to be returned from the response, I needed something like this _source: { excludes: ['tags', '@version', '@timestamp'] },

I don’t think that’s implemented yet.

Yeah, thank you that sort worked as well, thank you.

About documentation, I have an idea, for a very quick recap, we can just create kind of example mini projects by adding a new directory called examples, showcasing how the core methods are used and how arguments are structured, let me know what you think about that, so then good and well written documentation can come as we go.

jgaskins commented 1 year ago

About documentation, I have an idea, for a very quick recap, we can just create kind of example mini projects by adding a new directory called examples, showcasing how the core methods are used and how arguments are structured

This is what I do with all my shards and it is a common convention in Crystal shards. I have several examples locally for this one, but they remain uncommitted while I fleshed out the structure of the shard.

As mentioned in the README, this shard is known to be rough around the edges still. Contributions are welcome.