pelias / whosonfirst

Importer for Who's on First gazetteer
MIT License

Parse Error while loading into Elasticsearch Cluster #457

Closed: vrozental closed this issue 5 years ago

vrozental commented 5 years ago

I got the error "POST http://sdt-dev-elastic-001.test.pro:9200/_bulk => Parse Error" while running npm start to import data into our Elasticsearch cluster.

What might be the reason?

myuser@sdt-dev-elastic-001:~/pelias-setup/whosonfirst$ time npm start

> pelias-whosonfirst@0.0.0-development start /home/myuser/pelias-setup/whosonfirst
> ./bin/start

Elasticsearch ERROR: 2019-07-02T09:22:28Z
  Error: Request error, retrying
  POST http://sdt-dev-elastic-001.test.pro:9200/_bulk => Parse Error
      at Log.error (/home/myuser/pelias-setup/whosonfirst/node_modules/elasticsearch/src/lib/log.js:226:56)
      at checkRespForFailure (/home/myuser/pelias-setup/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:259:18)
      at HttpConnector.<anonymous> (/home/myuser/pelias-setup/whosonfirst/node_modules/elasticsearch/src/lib/connectors/http.js:164:7)
      at ClientRequest.wrapper (/home/myuser/pelias-setup/whosonfirst/node_modules/lodash/lodash.js:4935:19)
      at ClientRequest.emit (events.js:198:13)
      at Socket.socketOnData (_http_client.js:448:9)
      at Socket.emit (events.js:198:13)
      at addChunk (_stream_readable.js:288:12)
      at readableAddChunk (_stream_readable.js:269:11)
      at Socket.Readable.push (_stream_readable.js:224:10)

The importer settings from pelias.json:

"whosonfirst": {
      "datapath": "/mnt/local_drive_xvdc/pelias/data/whosonfirst",
      "importPostalcodes": true,
      "maxDownloads": 8
    }

The Elasticsearch version is:

  "version" : {
    "number" : "5.6.16",
    "build_hash" : "3a740d1",
    "build_date" : "2019-03-13T15:33:36.565Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },

missinglink commented 5 years ago

I have also seen these recently, usually two at the beginning of the import and then not again. I'm not sure exactly why it happens, but it appears that the request is retried and eventually succeeds.

Nothing to worry about.

vrozental commented 5 years ago

Thank you @missinglink

orangejulius commented 5 years ago

Yeah, these messages are nothing to worry about. It turns out Elasticsearch 5 emits a LOT of deprecation warnings for our current schema, and does so using HTTP headers. Node.js recently added a max header size, which triggers these errors. You can read all the details in https://github.com/pelias/schema/pull/337#issuecomment-444313941

We'll be able to fix these errors soon by cleaning up some of the deprecation warnings, now that we've completely dropped support for ES2. Until then, there is nothing to worry about.
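
For anyone who wants to see the limit involved, here is a minimal sketch (not taken from the Pelias code) that estimates whether a set of response headers would overflow Node.js's HTTP header limit. The helper names headerBytes and exceedsHeaderLimit are hypothetical, the sample warning text is made up, and the 8192-byte fallback is an assumption for Node.js versions that do not expose http.maxHeaderSize.

```javascript
// Sketch: estimate whether response headers would exceed Node.js's
// HTTP header size limit. Helper names are hypothetical.
const http = require('http');

// Approximate the on-the-wire size: one "Name: value\r\n" line per header value.
function headerBytes(headers) {
  let total = 0;
  for (const [name, value] of Object.entries(headers)) {
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      total += Buffer.byteLength(`${name}: ${v}\r\n`);
    }
  }
  return total;
}

// http.maxHeaderSize reports the limit in bytes on Node.js versions that
// expose it; 8192 is an assumed fallback, not a value from this thread.
function exceedsHeaderLimit(headers, limit = http.maxHeaderSize || 8192) {
  return headerBytes(headers) > limit;
}

// Made-up example: 200 deprecation warnings sent as Warning headers.
const warning = '299 Elasticsearch-5.6.16 "' + 'x'.repeat(120) + '"';
const headers = {
  'content-type': 'application/json',
  warning: Array(200).fill(warning),
};

console.log('limit (bytes):', http.maxHeaderSize);
console.log('estimated header bytes:', headerBytes(headers));
console.log('would overflow:', exceedsHeaderLimit(headers));
```

Node.js versions that expose http.maxHeaderSize also accept a --max-http-header-size command-line option, which may be one way to raise the limit while the deprecation warnings remain. Whether that suits the importer is not something this thread confirms.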