elastic / elasticsearch-net

This strongly typed client library enables working with Elasticsearch. It is the official client maintained and supported by Elastic.
https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/index.html
Apache License 2.0

Serialization of nested objects #3186

Closed: ayonix closed this issue 6 years ago

ayonix commented 6 years ago

NEST/Elasticsearch.Net version: 6.0.2

Elasticsearch version: 6.2.3

Description of the problem including expected versus actual behavior: NEST serializes the nested object incorrectly and therefore indexes wrong values. Nested values are serialized as empty arrays instead of strings, whereas serializing the same object with Newtonsoft.Json 11 yields the expected JSON representation.

Steps to reproduce:

  1. Use NEST to index a nested object
  2. Check the call/indexed document

Provide ConnectionSettings (if relevant): Standard ElasticClient with some default index

Provide DebugInformation (if relevant):

Debuginfo
Valid NEST response built from a successful low level call on PUT: /indexwriter/Type1/b5554a0f-42dd-42f2-91fe-e68127c1f486
# Audit trail of this API call:
 - [1] PingSuccess: Node: http://localhost:9200/ Took: 00:00:00
 - [2] HealthyResponse: Node: http://localhost:9200/ Took: 00:00:00.0580000
# Request:
{"Children":[{"Id":[],"Name":[],"Property1":[],"Nested1":[{"Name":[],"Value":[]},{"Name":[],"Value":[]}]},{"Id":[],"Name":[],"Property1":[],"Nested1":[{"Name":[],"Value":[]},{"Name":[],"Value":[]}]}]}
# Response:
{"_index":"index-2018.04.05-000001","_type":"Type1","_id":"b5554a0f-42dd-42f2-91fe-e68127c1f486","_version":1,"result":"created","_shards":{"total":1,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

JsonConvert.SerializeObject(entry):

{
    "Prop1": null,
    "Children": [
        {
            "Id": 0,
            "Name": "e2b01ee9-c4cd-491a-910b-5ce139caaa91",
            "Property1": true,
            "Nested1": [
                {
                    "Name": "9163eaf5-0268-46c4-8bcb-f0077c96d478",
                    "Value": "cfc13220-87ef-4c6e-832d-029bd7d5e4e6"
                },
                {
                    "Name": "4f564932-89f5-46f2-95e5-f9cfb427803a",
                    "Value": "2c10d357-e587-442c-a597-3900ce8732a6"
                }
            ]
        },
        {
            "Id": 0,
            "Name": "eae6a6bf-962c-4d07-8ca3-61d2fa1990bd",
            "Property1": true,
            "Nested1": [
                {
                    "Name": "30102062-380a-43b1-91fc-3de253f571c7",
                    "Value": "de48ca37-6b28-4a99-8993-669b430728c9"
                },
                {
                    "Name": "6ec8d226-30af-45d6-86b1-4d08b2315ef4",
                    "Value": "c3c93525-7e72-48a3-97fd-a24e586e5c12"
                }
            ]
        }
    ]
}

The Mapping

{
    "template": "*",
    "mappings": {
        "Type1": {
            "_all": {
                "enabled": "false"
            },
            "dynamic_templates": [{
                "strings_keyword": {
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "keyword",
                        "index": "true"
                    }
                }
            }]
        }
    }
}

russcam commented 6 years ago

@ayonix would you be able to provide a small but complete example e.g. the classes that you're serializing, creating the index and indexing some sample documents?

ayonix commented 6 years ago

https://gist.github.com/ayonix/ec8555c6e96ee4a05dc44c6c956ca92d

The problem seems to be related to deserializing with Newtonsoft.Json, not nested objects. Indexing the object directly works, but after deserializing it indexes properties as empty arrays.

russcam commented 6 years ago

Thanks for the example @ayonix 👍 OK, I can see what the issue is.

object doc = JsonConvert.DeserializeObject<object>(docString);

With Newtonsoft.Json, this call actually returns a Newtonsoft.Json.Linq.JObject rather than a plain object.

In versions of NEST prior to 6.x, this would have worked fine because NEST had a direct dependency on Newtonsoft.Json for serialization and knew how to handle Newtonsoft.Json.Linq.JObject in a special way.

In NEST 6.x+ however, the dependency on Newtonsoft.Json is IL-merged, internalized and re-namespaced to Nest.Json. When it comes to handling Newtonsoft.Json.Linq.JObject in a special way however, it does not know how to because the internal serializer knows nothing about types in the Newtonsoft.Json namespace.
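The behaviour described above can be checked without a cluster. The following is a minimal sketch, assuming only the Newtonsoft.Json package; the sample JSON string and the class name JObjectDemo are hypothetical. It shows the runtime types that DeserializeObject<object> produces, and that a JValue, viewed purely as an IEnumerable<JToken>, has no elements. A serializer that does not recognise these types and falls back to treating them as generic collections would plausibly write such a value as [], which matches the request body in the issue, although the thread does not confirm this as the exact mechanism.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JObjectDemo
{
    public static void Main()
    {
        // Hypothetical sample document, loosely shaped like one child from the issue.
        const string docString = "{\"Name\":\"e2b01ee9\",\"Property1\":true}";

        // Asking for 'object' does not yield a plain object or dictionary.
        object doc = JsonConvert.DeserializeObject<object>(docString);
        Console.WriteLine(doc.GetType().FullName);   // Newtonsoft.Json.Linq.JObject

        // Each property value is a JToken; a string value is a JValue.
        JToken name = ((JObject)doc)["Name"];
        Console.WriteLine(name.GetType().FullName);  // Newtonsoft.Json.Linq.JValue

        // Every JToken implements IEnumerable<JToken>; a JValue has no children,
        // so enumerating it yields nothing.
        Console.WriteLine(name is IEnumerable<JToken>);          // True
        Console.WriteLine(((IEnumerable<JToken>)name).Count());  // 0
    }
}
```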

If you'd like to keep using Newtonsoft.Json as the serializer for your documents, you can install the NEST.JsonNetSerializer NuGet package and hook up JsonNetSerializer as the source serializer:

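using System;                  // Uri
using Elasticsearch.Net;       // SingleNodeConnectionPool
using Nest;                    // ConnectionSettings, ElasticClient
using Nest.JsonNetSerializer;  // JsonNetSerializer
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var connectionSettings =
    new ConnectionSettings(pool, sourceSerializer: JsonNetSerializer.Default);
var client = new ElasticClient(connectionSettings);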

Your example will then serialize as expected.
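A different workaround, not proposed in this thread, is to avoid JObject altogether by deserializing into concrete classes. This fits the earlier observation that indexing the object directly works. The sketch below assumes only Newtonsoft.Json. The class names Entry, Child and NestedItem are hypothetical, inferred from the JSON in the issue rather than taken from the gist, and the sample values are shortened.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical document classes, inferred from the JSON shown in the issue.
public class NestedItem
{
    public string Name { get; set; }
    public string Value { get; set; }
}

public class Child
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool Property1 { get; set; }
    public List<NestedItem> Nested1 { get; set; }
}

public class Entry
{
    public string Prop1 { get; set; }
    public List<Child> Children { get; set; }
}

public static class TypedDemo
{
    public static void Main()
    {
        const string docString =
            "{\"Prop1\":null,\"Children\":[{\"Id\":0,\"Name\":\"c1\",\"Property1\":true," +
            "\"Nested1\":[{\"Name\":\"n1\",\"Value\":\"v1\"}]}]}";

        // Deserializing to a concrete type yields plain CLR objects,
        // with no Newtonsoft.Json.Linq types anywhere in the graph.
        Entry entry = JsonConvert.DeserializeObject<Entry>(docString);

        Console.WriteLine(entry.GetType().FullName);             // Entry
        Console.WriteLine(entry.Children[0].Nested1[0].Value);   // v1
    }
}
```

An Entry instance obtained this way contains no Newtonsoft.Json types, so it should behave like any other directly constructed document when passed to the client.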

johnrom commented 2 years ago

"When it comes to handling Newtonsoft.Json.Linq.JObject in a special way however, it does not know how to because the internal serializer knows nothing about types in the Newtonsoft.Json namespace."

Should this actually be fixed in the "special way"? I ran into the same issue when deserializing a nested object's array field. Is it at least documented somewhere that users should switch serializers when using nested objects and array fields?