elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.51k stars 24.6k forks source link

"object field starting or ending with a [.] makes object resolution ambiguous" does not appear to be accurate for all inputs #28948

Open danlevy1 opened 6 years ago

danlevy1 commented 6 years ago

Elasticsearch version: 6.2.1

Plugins installed: None

JVM version: 1.8.0_131

OS version: MacOS 10.13

I am running across a few test cases where the following error is not making sense to me.

The first example includes a [.] in front of foo. In this case, there is an error that states that there cannot be a [.] at the beginning of the object field:

Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    ".foo" : {
        "foo" : {
            "bar" : 123 
        }
    } 
}
'

Output (Error):

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "object field starting or ending with a [.] makes object resolution ambiguous: [.foo]"
    }
  },
  "status" : 400
}

This makes sense according to the error.

The second example includes a [.] after foo. In this case, there is no error.

Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    "foo." : {
        "foo" : {
            "bar" : 123 
        }
    } 
}
'

Output (Success):

{
  "_index" : "test",
  "_type" : "test",
  "_id" : "1",
  "_version" : 9,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 8,
  "_primary_term" : 1
}

Get (The [.] is included in the field name):

curl -XGET 'localhost:9200/test/test/1'
{ "_index":"test","_type":"test","_id":"1","_version":9,"found":true,"_source": {
"foo." : {
    "foo" : {
        "bar" : 123 
    }
} 
}

I was looking at the following GitHub / Elasticsearch Discuss Posts 1) https://discuss.elastic.co/t/name-cannot-be-empty-string-error-if-field-name-starts-with/66808 2) https://github.com/elastic/elasticsearch/issues/15951

Based on these, it seems like the second example should be throwing the same error as the first example.

jkakavas commented 6 years ago

I am running across a few test cases where the following error is not making sense to me.

If I understand this correctly, it is the lack of error in the second case that doesn't make sense to you, correct ? Dots are allowed in field names since 2.4, why would you think the second example should throw an error ?

danlevy1 commented 6 years ago

@jkakavas I am basing this off of https://github.com/elastic/elasticsearch/issues/15951, where it states in the first line, "we had to reject field names containing dots."

rjernst commented 6 years ago

@danlevy1 You are correct, this is a bug. The problem is in DocumentParser.splitAndValidatePath. Here we call String.split, and assume that if any element is an empty string, then there was a double dot. However, it does not actually work for trailing dot, due to oddities in the implementation of String.split. From the javadocs there:

Trailing empty strings are therefore not included in the resulting array.

So the algorithm in splitAndValidatePath needs to be amended. Additionally, it seems we are not actually calling that method with the full path all the time, as your example error only contained .foo when it should have been .foo.foo.bar.

elasticmachine commented 6 years ago

Pinging @elastic/es-search-aggs

jkakavas commented 6 years ago

#15951 is 2 years and explicitly discusses options for re enabling support for dots in field names. It was closed when this was introduced in 2.4 as I linked above. I'm going to close this issue for now as this is expected behavior.

Jumped the gun on this one, thanks @rjernst for chiming in.

danlevy1 commented 6 years ago

@rjernst I will take look more into this. I was actually fixing issue https://github.com/elastic/elasticsearch/issues/21862 when I came across this one. I will update this as soon as I take a look at the splitAndValidatePath method. I will hold off on posting the fix on issue https://github.com/elastic/elasticsearch/issues/21862 until I can fix this bug. It's possible the two are intertwined.

danlevy1 commented 6 years ago

@rjernst Just to confirm, the second example I provided in the bug report should throw the same error that the first example throws. Is this correct?

rjernst commented 6 years ago

Yes, that is correct.

jpountz commented 6 years ago

cc @elastic/es-search-aggs

danlevy1 commented 6 years ago

@rjernst I have been playing around with different test cases and the method splitAndValidatePath seems to take in the string in parts. In other words, it never takes in the full field path (.foo.foo.bar) in my first example. Do you have a test case where splitAndValidatePath is called with the full field path? It seems to me like this is more of a systemic problem.

At this point, I can treat the splitAndValidatePath method as a method that only takes in one part at a time if that's what you feel is best. Do you have any thoughts?

rjernst commented 6 years ago

splitAndValidatePath takes in each element within the json structure. For example, consider following json document:

{
    "foo.bar" : {
        "baz" : 1
    }
}

In this example, we would call splitAndValidatePath("foo.bar"), followed by splitAndValidatePath("baz"). The first call would produce 2 elements, which would cause us to lookup (and possibly dynamically create as an object field) a field in the mappings called foo. The intent of the method is to split these field names in json that may contain path separators (ie dot), so that we can mimic the above as if it was specified as:

{
    "foo" : {
        "bar" : {
            "baz" : 1
        }
    }
}
danlevy1 commented 6 years ago

After running ./gradlew check on my fixed code, I am getting two errors.

  1. org.elasticsearch.index.mapper.DocumentParserTests.testDynamicFieldsEmptyName
  2. org.elasticsearch.index.mapper.DocumentParserTests.testDynamicFieldsStartingAndEndingWithDot

For the first error, I ran these two tests to see why the error is occurring:

Test 1 Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    " " : {
        "bar" : {
            "baz" : 123 
        }
    } 
}
'

Test 1 Output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "object field cannot contain only whitespace: [' .bar']"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "object field cannot contain only whitespace: [' .bar']"
  },
  "status" : 400
}

Test 2 Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    "foo" : {
        "bar" : {
            " " : 123 
        }
    } 
}
'

Test 2 Output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "object field cannot contain only whitespace: ['foo.bar. ']"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "object field cannot contain only whitespace: ['foo.bar. ']"
  },
  "status" : 400
}

When I looked at DocumentParserTests.testDynamicFieldsEmptyName the error looks the same as mine. I am not sure why this test is failing.

For the second error, I believe the test is failing because my fix changes the way [.]'s are handled. Here are four test cases:

Test 1 Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    ".foo" : {
        "bar" : {
            "baz" : 123 
        }
    } 
}
'

Test 1 Output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "object field starting or ending with a [.] makes object resolution ambiguous: ['.foo']"
    }
  },
  "status" : 400
}

Test 2 Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    "foo." : {
        "bar" : {
            "baz" : 123 
        }
    } 
}
'

Test 2 Output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "object field starting or ending with a [.] makes object resolution ambiguous: ['foo.']"
    }
  },
  "status" : 400
}

Test 3 Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    "fooOne..fooTwo" : {
        "bar" : {
            "baz" : 123 
        }
    } 
}
'

Test 3 Output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "object field starting or ending with a [.] makes object resolution ambiguous: [fooOne..fooTwo]"
    }
  },
  "status" : 400
}

Test 4 Input:

curl -XPUT 'localhost:9200/test/test/1?pretty' -H 'Content-Type: application/json' -d' {
    "fooOne.fooTwo." : {
        "bar" : {
            "baz" : 123 
        }
    } 
}
'

Test 4 Output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "object field starting or ending with a [.] makes object resolution ambiguous: ['fooOne.fooTwo.']"
    }
  },
  "status" : 400
}

When I looked at DocumentParserTests.testDynamicFieldsStartingAndEndingWithDot the error appears to create two [.]'s ('top..foo..bar'). My fix doesn't do this and I don't think it should be doing that.

Can an Elastic team member please advise me on how to move forward? I don't want to submit a PR until all of these tests pass. @rjernst

jtibshirani commented 5 years ago

@danlevy1 I'm sorry that it has taken us so long to get back to you. Are you still interested in working on a fix for this issue (and #21862)?

If so, it would be great to open a PR with your current approach. You can make a note on the PR that certain tests are failing, and we can discuss how to debug them there. If you'd like, you can open a 'Draft' PR to make it clear it's a work in progress.

jtibshirani commented 4 years ago

@danlevy1 another friendly ping to see if you'd like to pick this up, otherwise I'll plan to submit a PR. As a heads up, I opened #49946 to address the problem for direct 'put mapping' calls.

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)