amazon-archives / sql-jdbc

🔍 Open Distro for Elasticsearch JDBC Driver
Apache License 2.0

Post submission only supports iso-8859-1 encoding #54

Closed ningpengtao-coder closed 4 years ago

ningpengtao-coder commented 4 years ago

In ApacheHttpTransport (line 263), HttpPost.setEntity is called with a StringEntity that is created without specifying an encoding. ISO-8859-1 is therefore used by default, and there is no way to set another encoding such as UTF-8.

I am using version 1.4.0.0

1.4.0.0:

ApacheHttpTransport: request.setEntity(new StringEntity(body));

StringEntity:

public StringEntity(final String string)
        throws UnsupportedEncodingException {
    this(string, ContentType.DEFAULT_TEXT);
}

public StringEntity(final String string, final ContentType contentType)
        throws UnsupportedCharsetException {
    super();
    Args.notNull(string, "Source string");
    Charset charset = contentType != null ? contentType.getCharset() : null;
    if (charset == null) {
        charset = HTTP.DEF_CONTENT_CHARSET;
    }
    this.content = string.getBytes(charset);
    if (contentType != null) {
        setContentType(contentType.toString());
    }
}

ContentType:

public static final ContentType DEFAULT_TEXT = TEXT_PLAIN;
public static final ContentType TEXT_PLAIN = create("text/plain", Consts.ISO_8859_1);

expected: request.setEntity(new StringEntity(body, ContentType.APPLICATION_JSON));
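The effect of that charset choice on the request bytes can be sketched with the JDK alone. This is a simplified illustration, not the driver's or HttpCore's code: the class EntityCharsetSketch and its methods resolve and encode are hypothetical names, and the fallback is reduced to "no charset given means ISO-8859-1". The non-ASCII characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the driver or of Apache HttpCore.
class EntityCharsetSketch {

    // Simplified version of the fallback quoted above:
    // when no charset is declared, ISO-8859-1 is used.
    static Charset resolve(Charset declared) {
        return declared != null ? declared : StandardCharsets.ISO_8859_1;
    }

    // Encodes a request body with the resolved charset.
    static byte[] encode(String body, Charset declared) {
        return body.getBytes(resolve(declared));
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character Chinese string used in this issue.
        String value = "\u4e2d\u65871";

        byte[] fallback = encode(value, null);
        byte[] utf8 = encode(value, StandardCharsets.UTF_8);

        // Characters that ISO-8859-1 cannot represent are replaced during encoding,
        // so the two byte arrays differ in length and content.
        System.out.println("fallback bytes: " + fallback.length);
        System.out.println("UTF-8 bytes:    " + utf8.length);
    }
}
```

Passing an explicit UTF-8 charset, which is what ContentType.APPLICATION_JSON carries, keeps the original characters intact in the encoded body.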

dai-chen commented 4 years ago

@PengtaoNing Thanks for reporting the issue! Could you provide some sample data and a use case where you hit the problem in our JDBC driver?

ningpengtao-coder commented 4 years ago


POST _bulk
{"create":{"_index":"sql_data","_id":1}}
{"name":"中文1"}
{"create":{"_index":"sql_data","_id":2}}
{"name":"中文2"}
{"create":{"_index":"sql_data","_id":3}}
{"name":"中文3"}
{"create":{"_index":"sql_data","_id":4}}
{"name":"中文4"}
{"create":{"_index":"sql_data","_id":5}}
{"name":"中文5"}

Correct result with Kibana:

POST _opendistro/_sql
{ "query": "SELECT * FROM sql_data where name = '中文1'" }

Correct result with curl:

curl -XPOST "http://localhost:9200/_opendistro/_sql" -H 'Content-Type: application/json' -d'{ "query": "SELECT * FROM sql_data where name = \"中文1\""}'

Incorrect result with JDBC:

Connection connection = DriverManager.getConnection(jdbcElasticsearchURL);

Statement statement = connection.createStatement();

ResultSet resultSet = statement.executeQuery("SELECT * FROM sql_data where name = '中文1'");

The default charset is ISO-8859-1, so the JDBC query returns the wrong result: the SQL statement is encoded as:

SELECT * FROM sql_data where name = '????1'
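One way to confirm that the charset is the cause is to check whether the query text survives an encode/decode round trip. The sketch below uses only the JDK; QueryCharsetCheck, canRepresent and roundTrip are hypothetical names introduced for illustration, and the query is the one from this report with the Chinese characters written as Unicode escapes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic for illustration only; not part of the JDBC driver.
class QueryCharsetCheck {

    // True if every character of the text can be encoded in the given charset.
    static boolean canRepresent(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encodes and decodes with the same charset; a lossy charset changes the text.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM sql_data where name = '\u4e2d\u65871'";

        System.out.println("ISO-8859-1 can represent query: "
                + canRepresent(sql, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 can represent query:      "
                + canRepresent(sql, StandardCharsets.UTF_8));

        // The ISO-8859-1 round trip substitutes the characters it cannot encode.
        System.out.println(roundTrip(sql, StandardCharsets.ISO_8859_1));
        System.out.println(roundTrip(sql, StandardCharsets.UTF_8).equals(sql));
    }
}
```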

dai-chen commented 4 years ago

@PengtaoNing Thanks for the info! Will try to reproduce from my side.

penghuo commented 4 years ago

The issue has been fixed with #68. Feel free to reopen it.