elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.15k stars 24.84k forks source link

Hex numbers in YAML requests are treated as strings #66555

Open andrewkroh opened 3 years ago

andrewkroh commented 3 years ago

Elasticsearch version (bin/elasticsearch --version):

{
  "number": "7.11.0-SNAPSHOT",
  "build_flavor": "default",
  "build_type": "docker",
  "build_hash": "00e58bef3ac3f0bd7ab68b90e38e11cac05fb49e",
  "build_date": "2020-12-15T14:29:43.737519Z",
  "build_snapshot": true,
  "lucene_version": "8.7.0",
  "minimum_wire_compatibility_version": "6.8.0",
  "minimum_index_compatibility_version": "6.0.0-beta1"
}

Plugins installed: []

JVM version (java -version):

{
  "version": "15.0.1",
  "vm_name": "OpenJDK 64-Bit Server VM",
  "vm_version": "15.0.1+9",
  "vm_vendor": "AdoptOpenJDK",
  "bundled_jdk": true,
  "using_bundled_jdk": true,
  "count": 1
}

OS version (uname -a if on a Unix-like system): Our official Docker container based on CentOS Linux 8

Description of the problem including expected versus actual behavior:

When Elasticsearch unmarshals YAML scalar values in hexadecimal format it interprets the value as a string, but it should be a number. For example 0x1 becomes "0x01" when it should a 1. This causes problems when you expected the data type to be a number (like when reading the param value within painless).

Reference: https://yaml.org/spec/1.2/spec.html#id2766934

Relates: https://github.com/elastic/kibana/issues/85486

Steps to reproduce:

  1. Create an ingest pipeline using Content-Type: application/yaml. Include a hexidecimal number.
curl \
  -u elastic:changeme \
  -H 'Content-Type: application/yaml' \
  -X PUT \
  "localhost:9200/_ingest/pipeline/yaml_test" \
  -d'---

description: YAML unmarshal bug demo
processors:
  - script:
      lang: painless
      params:
        number_base10: 1
        number_base16: 0x1
        string: hello
      source: >
        params.forEach((k, v) -> ctx[k] = v);'
  1. Read the pipeline.
curl -u elastic:changeme -H 'Content-Type: application/yaml' -X GET "localhost:9200/_ingest/pipeline/yaml_test"
---
yaml_test:
  description: "YAML unmarshal bug demo"
  processors:
  - script:
      lang: "painless"
      params:
        number_base10: 1
        number_base16: "0x1"
        string: "hello"
      source: "params.forEach((k, v) -> ctx[k] = v);"
$ curl -u elastic:changeme -X GET "localhost:9200/_ingest/pipeline/yaml_test?pretty"
{
  "yaml_test" : {
    "description" : "YAML unmarshal bug demo",
    "processors" : [
      {
        "script" : {
          "lang" : "painless",
          "params" : {
            "number_base10" : 1,
            "number_base16" : "0x1",
            "string" : "hello"
          },
          "source" : "params.forEach((k, v) -> ctx[k] = v);"
        }
      }
    ]
  }
}

Note that number_base16 is "0x1". It should a 1.

andrewkroh commented 3 years ago

I was curious so I started looking upstream and found https://github.com/FasterXML/jackson-dataformats-text/issues/233 which appears to have been fixed in 2.12.

https://github.com/elastic/elasticsearch/blob/feab123ba400b150f3dcd04dd27cf57474b70d5a/buildSrc/version.properties#L12

elasticmachine commented 3 years ago

Pinging @elastic/es-core-infra (Team:Core/Infra)

rjernst commented 3 years ago

Thanks for reporting and the investigation upstream @andrewkroh. It looks like 2.12 was just released a few weeks ago. I'm going to mark this as help wanted so anyone can pick it up if they feel so empowered. In theory, simply bumping our Jackson version should enable support.

jamesrowe08 commented 3 years ago

Bumped version of jackson and tested this. Please have a look @elasticmachine

rjernst commented 3 years ago

72995 has to be reverted, so the upgrade of Jackson fixing this hex issue needs to be reopened.