elastic / elasticsearch-rs

Official Elasticsearch Rust Client
https://www.elastic.co/guide/en/elasticsearch/client/rust-api/current/index.html
Apache License 2.0
702 stars 72 forks

[BUG] error decoding response body: missing field `<field>` at line x column y #70

Closed DevQps closed 4 years ago

DevQps commented 4 years ago

**Describe the bug**
It seems that there are some problems with deserializing structs.

**To Reproduce**

  1. I created an index and pushed my structs using the following code:

    let response = client
        .index(IndexParts::IndexId(index, id))
        .body(&self)
        .send()
        .await?;

    Ok(response.status_code().is_success())

  2. I can verify in Kibana that the entities are correctly created.
  3. When I try to retrieve them using the following code:

    let mut response = client
        .get(GetParts::IndexId(index, id))
        .send()
        .await?;

    response.read_body().await?

I receive the error `error decoding response body: missing field `<field>` at line x column y`.

The interesting part however is that if I do:

    let mut response = client
        .get(GetParts::IndexId(index, id))
        .send()
        .await?;

    let value: Value = response.read_body().await.unwrap();
    let value = value.get("_source").unwrap();
    let value: Self = serde_json::from_value(value.clone()).unwrap();
    Ok(value)

It can successfully decode the response.

The struct I use has the following format:

    pub struct MyStruct {
        pub a: String,
        pub b: String,
        pub c: Vec<HashMap<String, String>>,
        pub d: u64,
    }

The error stated that field `a` was missing.

EDIT: As a bonus I printed out the `value` from the second (working) example, and the JSON I printed contained all the parameters of MyStruct. 

**Expected behavior**
Expected `response.read_body()` to successfully deserialize the response.

**Environment (please complete the following information):**
 - OS: Windows 10 Pro
 - rustc 1.41.1 (f3e1a954d 2020-02-24)

DevQps commented 4 years ago

I really enjoy the API so far btw! Thanks a lot for putting so much effort into it! It saved me a lot of time and effort :)

russcam commented 4 years ago

> I really enjoy the API so far btw! Thanks a lot for putting so much effort into it! It saved me a lot of time and effort :)

Thanks 😄

In step 3,

response.read_body().await?

do you specify a type to deserialize into? Something like

let index = "foo";
let id = "bar";

let response = client
    .get(GetParts::IndexId(index, id))
    .send()
    .await?;

let get_response = response.read_body::<Value>().await?;
let source: MyStruct = serde_json::from_value(get_response["_source"].clone())?;

#[derive(serde::Deserialize)]
pub struct MyStruct
{
    pub a: String,
    pub b: String,
    pub c: Vec<HashMap<String, String>>,
    pub d: u64,
}

I think this would fail with MyStruct if a _source did not have values for all of the a, b, c and d fields, because they're all non-optional. Would you want default values in these cases?

If it's just the _source that you're interested in, the source API may be more suitable

let index = "foo";
let id = "bar";

let response = client
    .get_source(GetSourceParts::IndexId(index, id))
    .send()
    .await?;

let source = response.read_body::<MyStruct>().await?;

DevQps commented 4 years ago

Thanks for your reply!

I use Self as the return type (I implemented it via a trait), so the Self type in this case is MyStruct.

The complete function is as follows:

#[async_trait]
impl<T> ElasticOperations for T where T: ...
{
    /// Pulls one DNS record from Elasticsearch.
    async fn pull(client: &Elasticsearch, index: &str, id: &str) -> Result<Option<Self>, Error>
    {
        let response = client
            .get(GetParts::IndexId(index, id))
            .send()
            .await?;

        // Last part
        let value: Value = response.read_body().await?;
        let value = match value.get("_source") {
            Some(x) => x,
            None => return Ok(None),
        };

        Ok(serde_json::from_value(value.clone())?)
    }
}

The code listed above works correctly. However if I change the last part to:

if response.status_code().is_success() {
    match response.read_body::<Self>().await {
        Ok(x) => return Ok(Some(x)),
        Err(e) => return Err(e),
    };
} else {
    return Ok(None);
}

I get an error saying that the first field of the struct is missing, even though I am sure the data is there (it can be viewed in Kibana as well). None of the fields in MyStruct are optional, but they are all present in Kibana. I double-checked this.

Mmm might there be something I haven't thought about?

russcam commented 4 years ago

The following two calls are not the same

// Last part
let value: Value = response.read_body().await?;
let value = match value.get("_source") {
    Some(x) => x,
    None => return Ok(None),
};

Ok(serde_json::from_value(value.clone())?)

and

if response.status_code().is_success() {
    match response.read_body::<Self>().await {
        Ok(x) => return Ok(Some(x)),
        Err(e) => return Err(e),
    };
} else {
    return Ok(None);
}

In the former, the entire response body is deserialized to Value, and then MyStruct is deserialized from the Value at the property "_source". The latter tries to deserialize the entire response body to MyStruct. The former is correct and the latter is not, because a response from the GET API looks like the following JSON (an example):

{
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "0",
    "_version" : 1,
    "_seq_no" : 10,
    "_primary_term" : 1,
    "found": true,
    "_source" : {
        "user" : "kimchy",
        "date" : "2009-11-15T14:12:12",
        "likes": 0,
        "message" : "trying out Elasticsearch"
    }
}

DevQps commented 4 years ago

Ahhh that makes everything clear! I was under the impression that read_body would automatically use the _source value to deserialize the struct. Thanks for this explanation! I will close this issue now since it was simply a misunderstanding.

russcam commented 4 years ago

No worries @DevQps! Just a heads up, looking to rename read_body() in #74 to just json()