elastic / elasticsearch-rs

Official Elasticsearch Rust Client
https://www.elastic.co/guide/en/elasticsearch/client/rust-api/current/index.html
Apache License 2.0
702 stars 72 forks source link

[ENHANCEMENT] Add BulkBodyBuilder/BulkRequestBuilder? #43

Closed mwilliammyers closed 4 years ago

mwilliammyers commented 4 years ago

The current bulk API is a little raw/not very "Rusty". I understand that it matches the actual bulk REST API and that this library, in general, is meant to be a thin abstraction over the REST API.

That being said, is it within the scope of this library to add something like a BulkDocumentOperation from the elastic crate—used like this? It would abstract over update vs. index vs. delete etc. and expose options for each like setting source: true for update to return the updated doc (which is available on elastic-rs/elastic#master).

russcam commented 4 years ago

I knew it'd only be a matter of time before request types came up 😄.

I'll open an issue to discuss request/response types more broadly, but I think for an extremely common API like bulk, it'd make sense to have something like BulkDocumentOperation to work with.

Perhaps a BulkOperation struct, similar to BulkDocumentOperation, that implements the Body trait, and can accept a type that implements Serialize for those operations that support sending a document?

mwilliammyers commented 4 years ago

I am glad I could be the first to bring them up 😄.

Yeah that was exactly what I was thinking.

russcam commented 4 years ago

I've taken an initial go at this in #56.

One thing I'm undecided about is how to support heterogeneous source types across bulk operations. The common use case is bulk API calls where all the source types are homogeneous e.g. all bulk index operations where the source of each is the same type. It would be good however to support the less common use case with heterogeneous types.

I've added a unit test with a possible implementation, where each operation is serialized to an internal buffer when added to a collection of operations:

https://github.com/elastic/elasticsearch-rs/blob/53a9c2e33cdd9e8c04969a4285b7dbb5b2397e37/elasticsearch/src/root/bulk.rs#L567-L580

But perhaps there's a nicer/better/idiomatic way? I looked at erased_serde which I think would make it possible to support Vec<BulkOperation<Box<dyn Serialize>>>, but I suspect there are some perf implications in using it. If there are, I don't think we would want to impact the common use case by blanket using it.

mwilliammyers commented 4 years ago

Yeah its tricky, let me look back at the elastic crate...

Yeah there is definitely a theoretic performance hit but I think we would want to benchmark to prove it before we make things less ergonomic.

I need to look at this more—but why aren't we considering using an enum? I must be missing something 🙃 Just to use the builder pattern?

mwilliammyers commented 4 years ago

By the way thanks for the PR! I am speaking at a conference in a few weeks, otherwise I would have tried taking a stab at it myself.