TheDeveloper / http-aws-es

Use the elasticsearch-js client with Amazon ES
https://www.npmjs.com/package/http-aws-es
MIT License

Bulk request fails with "Data must be a string or a buffer" #44

Open fiskeben opened 6 years ago

fiskeben commented 6 years ago

Hi,

I'm trying to use the /_bulk operation with ES 5.3 but I keep getting the error "Data must be a string or a buffer". I create my client the exact same way as the example in elastic/elasticsearch-js#274.

The error message doesn't give a lot but here it is:

Elasticsearch ERROR: 2017-10-26T08:18:04Z
  Error: Request error, retrying
  POST http://<my-es-instance>.amazonaws.com/_bulk => Data must be a string or a buffer
      at Log.error (/app/node_modules/elasticsearch/src/lib/log.js:225:56)
      at checkRespForFailure (/app/node_modules/elasticsearch/src/lib/transport.js:258:18)
      at cleanUp (/app/node_modules/http-aws-es/connector.js:58:9)
      at <anonymous>
      at process._tickDomainCallback (internal/process/next_tick.js:228:7)

Using trace logging I can see that my body is there and it gets logged in the error handler callback in elasticsearch-js.

I've tried to serialize the body myself before passing it to the bulk function but I get the same error. I don't pass any other parameters.

I haven't been able to trace down where the error actually occurs but I suspect it's in this.httpClient.handleRequest.

My code works fine without the AwsEsConnector (using a local Elasticsearch).

I'm beginning to suspect this could be a bug in the connector?

TheDeveloper commented 6 years ago

@fiskeben thanks for the report.

Any chance you can jump into the connector at runtime of your script and see what type params.body is around line 127?

I suspect it could be coming through there as something we don't expect.

I have tried to reproduce in a test but haven't yet found a payload that can break _bulk requests.

What version of elasticsearch-js do you have?

fiskeben commented 6 years ago

Thanks for the quick reply.

I added some logging just before the if (body) line, and below is the output (my logs are surrounded in =s). It seems like the body is a string to begin with but it fails anyway and the retry loses the body.

Elasticsearch DEBUG: 2017-10-26T10:53:10Z
  starting request {
    "method": "POST",
    "bulkBody": true,
    "path": "/_bulk",
    "body": "<redacted string>",
    "query": {}
  }

=================================================
Type: string
Body: <redacted>

=================================================
Elasticsearch TRACE: 2017-10-26T10:53:10Z
  -> POST http://<my-es>.amazonaws.com:80/_bulk
  <redacted>

  <- 0

Elasticsearch ERROR: 2017-10-26T10:53:10Z
  Error: Request error, retrying
  POST http://<my-es>.amazonaws.com/_bulk => Data must be a string or a buffer
      at Log.error (/app/node_modules/elasticsearch/src/lib/log.js:225:56)
      at checkRespForFailure (/app/node_modules/elasticsearch/src/lib/transport.js:258:18)
      at cleanUp (/app/connector.js:58:9)
      at <anonymous>
      at process._tickDomainCallback (internal/process/next_tick.js:228:7)

=================================================
Type: undefined
Body: undefined
=================================================
Elasticsearch TRACE: 2017-10-26T10:53:10Z
  -> HEAD http://<my-es>.amazonaws.com:80/

  <- 0

Elasticsearch WARNING: 2017-10-26T10:53:10Z
  Unable to revive connection: http://<my-es>.amazonaws.com/

Elasticsearch WARNING: 2017-10-26T10:53:10Z
  No living connections

Failed to update { Error: No Living connections
    at sendReqWithConnection (/app/node_modules/elasticsearch/src/lib/transport.js:225:15)
    at next (/app/node_modules/elasticsearch/src/lib/connection_pool.js:213:7)
    at _combinedTickCallback (internal/process/next_tick.js:131:7)
    at process._tickDomainCallback (internal/process/next_tick.js:218:9)
  message: 'No Living connections',
  body: undefined,
  status: undefined }

I'm using elasticsearch-js-13.3.1.

TheDeveloper commented 6 years ago

Ah this is great, thanks. So the body is going missing somehow before retry.

While we should try to fix that up, this does seem to be triggered by an issue connecting to your ES host. Can you try it with an https endpoint URL? It looks like the client is attempting to connect over http on port 80. This might help resolve the immediate issue for you.

fiskeben commented 6 years ago

Using https didn't make any difference (but I suppose I should use it anyway, so thanks for pointing that out).

However, I finally managed to figure it out. I added some more logging and it turned out the error came from signing the request. More debugging showed that the region was undefined which led me to how I create the client. Turns out I used the wrong name for the awsConfig because I had lifted it from the issue I mentioned in my initial post... now it works like a charm.

Thanks for the help. I don't know if the retry thing is an actual issue or a side effect of my little stunt here. You're welcome to close the issue if you will.
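For anyone landing here later, a plausible reason a missing region surfaces as "Data must be a string or a buffer": AWS Signature Version 4 derives its signing key through a chain of HMAC steps, one of which uses the region as input, and Node's crypto rejects undefined there. Below is a minimal sketch of that derivation using only Node's built-in crypto module. It is not the connector's or aws-sdk's actual code, and `hmac` and `deriveSigningKey` are illustrative names. Older Node releases worded the TypeError as "Data must be a string or a buffer"; newer ones phrase it differently.

```javascript
const crypto = require('crypto');

// One HMAC-SHA256 step; Node rejects `data` that is not a string or buffer.
function hmac(key, data) {
  return crypto.createHmac('sha256', key).update(data).digest();
}

// SigV4 signing-key derivation: date -> region -> service -> "aws4_request".
function deriveSigningKey(secretKey, dateStamp, region, service) {
  const kDate = hmac('AWS4' + secretKey, dateStamp);
  const kRegion = hmac(kDate, region); // throws if region is undefined
  const kService = hmac(kRegion, service);
  return hmac(kService, 'aws4_request');
}

// With a region, derivation succeeds and yields a 32-byte key.
const key = deriveSigningKey('SECRET', '20171026', 'us-east-1', 'es');
console.log(key.length);

// Without one, the HMAC update throws a TypeError.
let signingError = null;
try {
  deriveSigningKey('SECRET', '20171026', undefined, 'es');
} catch (err) {
  signingError = err;
}
console.log(signingError instanceof TypeError);
```

Since the failure happens while signing, before anything is sent, it would also explain why the trace above shows a response status of 0.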

nrodriguez commented 6 years ago

I've run into the same issue when using v2.x or 3.x.

21:59:37 0|service  | Elasticsearch ERROR: 2017-10-27T21:59:37Z
21:59:37 0|service  |   Error: Request error, retrying
21:59:37 0|service  |   POST ES_URL/_bulk => AWS Credentials error: Data must be a string or a buffer
21:59:37 0|service  |       at Log.error (/opt/app/current/node_modules/elasticsearch/src/lib/log.js:225:56)
21:59:37 0|service  |       at checkRespForFailure (/opt/app/current/node_modules/elasticsearch/src/lib/transport.js:258:18)
21:59:37 0|service  |       at HttpAmazonESConnector.<anonymous> (/opt/app/current/node_modules/http-aws-es/node6.js:76:11)
21:59:37 0|service  |       at bound (/opt/app/current/node_modules/elasticsearch/node_modules/lodash/dist/lodash.js:729:21)
21:59:37 0|service  |       at /opt/app/current/node_modules/http-aws-es/node6.js:104:9
21:59:37 0|service  |       at next (native)
21:59:37 0|service  |       at step (/opt/app/current/node_modules/http-aws-es/node6.js:21:191)
21:59:37 0|service  |       at /opt/app/current/node_modules/http-aws-es/node6.js:21:361
21:59:37 0|service  |       at process._tickDomainCallback (internal/process/next_tick.js:129:7)

This is the config we're passing to elasticsearch.Client

{ apiVersion: '5.1',
  plugins: [],
  defer: [Function: defer],
  connectionClass: [Function: HttpAmazonESConnector],
  amazonES:
   { region: 'us-east-1',
     credentials:
      EnvironmentCredentials {
        secretAccessKey: 'KEY_HERE',
        expired: false,
        expireTime: null,
        accessKeyId: 'KEY_ID_HERE',
        sessionToken: undefined,
        envPrefix: 'AWS' } },
  host: 'URL' }

recurrence commented 6 years ago

Any clue on this? I'm seeing this error as well but I don't see it outside AWS.

haoliangyu commented 6 years ago

I used to have the same issue but specifying the region resolved it (as @fiskeben suggested).

import AWS = require("aws-sdk");

AWS.config.region = "us-east-1";

zachguo commented 6 years ago

@haoliangyu Does the region have to be the region of ES instance?

rmelian commented 6 years ago

@TheDeveloper the default configuration as mentioned in the docs is not working:

let es = require('elasticsearch').Client({
    hosts: [ 'https://amazon-es-host.us-east-1.es.amazonaws.com' ],
    connectionClass: require('http-aws-es')
});

It only works when the region is specified, both locally and with an AWS cluster URL.

haoliangyu commented 6 years ago

@zachguo Not sure. I have only tried my EC2 instance's region, not any others.
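On the region question: Signature Version 4 scopes each signature to a region, so the configured region should be the region of the ES domain being called, not wherever the client happens to run. Amazon ES endpoint hostnames normally embed that region, so it can be read off the URL. A minimal sketch, assuming the hostname ends in `<region>.es.amazonaws.com`; `regionFromEsHost` is a hypothetical helper, not part of http-aws-es:

```javascript
// Hypothetical helper: extract the region from an Amazon ES endpoint URL.
// Assumes the hostname ends in "<region>.es.amazonaws.com".
function regionFromEsHost(endpoint) {
  const { hostname } = new URL(endpoint);
  const match = /\.([a-z0-9-]+)\.es\.amazonaws\.com$/.exec(hostname);
  // Return null for hosts that do not follow the pattern (e.g. localhost).
  return match ? match[1] : null;
}

console.log(regionFromEsHost('https://amazon-es-host.us-east-1.es.amazonaws.com'));
console.log(regionFromEsHost('http://localhost:9200'));
```

The result could then be passed as the `region` when building the AWS config, with an explicit setting taking precedence for custom domain names.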

chaddjohnson commented 6 years ago

This worked for me:

const client = elasticsearch.Client({
    hosts: process.env.ELASTICSEARCH_HOSTS,
    connectionClass: require('http-aws-es'),
    amazonES: {
        accessKey: process.env.S3_ACCESS_KEY_ID,
        secretKey: process.env.S3_SECRET_ACCESS_KEY
    },
    awsConfig: new AWS.Config({region: process.env.ELASTICSEARCH_REGION})
});

brentahiti commented 5 years ago

After digging into aws-sdk code, I found that:

AWS.config is an instance of AWS.Config, so there are two ways to do it:

const client = elasticsearch.Client({
    hosts: ['host'],
    connectionClass: require('http-aws-es'),
    awsConfig: new AWS.Config({
        accessKeyId: 'AKID', secretAccessKey: 'SECRET', region: 'us-west-2'
    })
});

or

AWS.config.update({
    credentials: new AWS.Credentials('AKID', 'SECRET'),
    region: 'us-west-2'
});
const client = new elasticsearch.Client({
    hosts: ['host'],
    connectionClass: require('http-aws-es'),
    awsConfig: AWS.config
});

lionellloh commented 5 years ago

THIS FINALLY WORKED FOR ME, THANKS SO MUCH!
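To catch a missing region at startup rather than as a signing error on the first request, a small pre-flight check on the client options is one possibility. A minimal sketch: `findRegion` and `assertRegion` are hypothetical helpers, not part of http-aws-es or aws-sdk. They only inspect the two option names that appear in this thread (`awsConfig` and `amazonES`); which one the installed connector version actually reads is not established here, and a region supplied some other way (for example through the environment) is not detected.

```javascript
// Hypothetical pre-flight check; not part of http-aws-es or aws-sdk.
// Looks for a region under the two option names used in this thread.
function findRegion(clientOptions) {
  const awsConfig = clientOptions.awsConfig || {};
  const amazonES = clientOptions.amazonES || {};
  return awsConfig.region || amazonES.region || null;
}

// Throws early with a readable message instead of failing during signing.
function assertRegion(clientOptions) {
  const region = findRegion(clientOptions);
  if (!region) {
    throw new Error('No AWS region found in awsConfig or amazonES options');
  }
  return region;
}

// Plain objects stand in for AWS.Config here so the sketch runs without aws-sdk.
console.log(assertRegion({ awsConfig: { region: 'us-west-2' } }));
```

Calling `assertRegion(options)` just before `new elasticsearch.Client(options)` would turn the misnamed-option mistake described earlier in the thread into an immediate, descriptive error.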