elastic / elasticsearch-rs

Official Elasticsearch Rust Client
https://www.elastic.co/guide/en/elasticsearch/client/rust-api/current/index.html
Apache License 2.0
702 stars 72 forks source link

BulkOperation builder implementations #56

Closed russcam closed 4 years ago

russcam commented 4 years ago

This commit introduces a collection of builders to help with the bulk API. The bulk API accepts a collection of operations in newline delimited JSON format, where each operation is represented as a header, followed by an optional source for the operation. The builders provide functions to build each of the supported operations, exposing the metadata options available for the operation. Operations can be converted to a common Operation type which implements the Body trait.

Closes #43

mwilliammyers commented 4 years ago

This looks like the direction we should head.

I have a few initial thoughts:

russcam commented 4 years ago

What about adding params like _source etc?

The query string params are covered on Bulk, the builder for the API call. for example

https://github.com/elastic/elasticsearch-rs/blob/9e7468a5447402bd74c364453ba3dd1cf5ffe3b1/elasticsearch/src/generated/root.rs#L115-L119

I think we should add an extend method that is generic over IntoIterator or Iterator. It was very handy to have.

Looks like this would logically go onto BulkOperations, and change the signature of body() on Bulk to accept BulkOperations as opposed to Vec<T> where T: Serialize (and change the generic type parameter on the return type from NdBody<T>...

russcam commented 4 years ago

In https://github.com/elastic/elasticsearch-rs/pull/56/commits/48f37ca0d10e85046ebcfdbeb439e0e8fa0a6818, I've changed the &'a str fields on bulk operations to use String, with functions accepting Into<String. The usage of &'a str makes it awkward to work with when constructing operations inside a loop, for example

https://github.com/elastic/elasticsearch-rs/blob/48f37ca0d10e85046ebcfdbeb439e0e8fa0a6818/elasticsearch/tests/common/client.rs#L83-L86

I did consider using Cow<str> for these, but thought Into<String> might be a little easier to work with. What are your thoughts, @mwilliammyers?

mwilliammyers commented 4 years ago

Yeah, I vote for Into<String>, it is the most flexible and let's the caller decide what type to use.

We can revisit and use Cow<str> or &'a str—or something else—iff performance/no-std compatibility becomes an issue (I highly doubt it ever will).

russcam commented 4 years ago

Added BulkOperations from the test to the package.

I think this PR is good to merge in for now. There's a couple of things that I think should be looked at after, such as Bulk accepting Vec<T> where T:Body for its body fn, which means wrapping BulkOperations in a Vec to be able to pass it, but I think this is OK for the moment.

mwilliammyers commented 4 years ago

What about:

where I: IntoIterator<Item=T>, T: Body

or

where T: IntoIterator, T::Item: Body

Not sure which I like better 🙃