markmcdowell / NLog.Targets.ElasticSearch

NLog target for Elasticsearch
MIT License

feat: better error handling from server bulk response. #149

Closed lucianaparaschivei closed 2 years ago

lucianaparaschivei commented 3 years ago

In some cases the Success flag on the BytesResponse is not enough to identify a failed index operation. We should rely on the BulkResponse instead, which returns the items that had errors.

For example, with the current approach the following response logs nothing in the internal logger:

Successful (200) low level call on POST: /_bulk
{"took":0,"errors":true,"items":[{"index":{"_index":"logstash-2021.07.12","_type":"_doc","_id":"OTgmmnoBq3Eup03vE24t","status":403,"error":{"type":"security_exception","reason":"action [indices:data/write/bulk[s]] is unauthorized for API key id [NzgjmnoBq3Eup03vg25H] of user [elastic] on indices [logstash-2021.07.12], this action is granted by the index privileges [create_doc,create,delete,index,write,all]"}

The BulkResponse, however, has the Errors flag and in this case includes the failed item in its ItemsWithErrors collection.
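To illustrate the point in a language-neutral way, here is a minimal Python sketch of what "rely on the items with errors" amounts to at the JSON level. It is not the NEST implementation and not code from this pull request: the function name `failed_bulk_items` and the abbreviated sample body are hypothetical, modeled on the response quoted above.

```python
import json


def failed_bulk_items(body):
    """Return (action, item) pairs for every failed item in a _bulk response body."""
    response = json.loads(body)
    # The top-level "errors" flag is true when at least one item failed,
    # even though the HTTP call itself returned 200.
    if not response.get("errors"):
        return []
    failures = []
    for entry in response.get("items", []):
        # Each entry has a single key naming the action (index, create, update, delete).
        for action, item in entry.items():
            if "error" in item or item.get("status", 200) >= 300:
                failures.append((action, item))
    return failures


# Hypothetical, abbreviated sample: one rejected item and one accepted item.
sample = json.dumps({
    "took": 0,
    "errors": True,
    "items": [
        {"index": {"_index": "logstash-2021.07.12", "status": 403,
                   "error": {"type": "security_exception",
                             "reason": "unauthorized (abbreviated)"}}},
        {"index": {"_index": "logstash-2021.07.12", "status": 201}},
    ],
})

for action, item in failed_bulk_items(sample):
    error = item.get("error", {})
    print(f"{action} failed with status {item.get('status')}: {error.get('type')}")
```

Only the rejected item is reported, which is the information that never reaches the internal logger when the HTTP status is the only thing checked.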

snakefoot commented 3 years ago

What is the performance overhead of using the NEST BulkResponse? (I prefer to optimize for success rather than for errors.)

Would it not be possible to extract the same information from the ApiCall property on BytesResponse?

I guess it would make sense to stop using the LowLevel client and change over to the NEST interface instead, since the LowLevel serializer fails to handle anything but simple data types.
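One way to picture "optimize for success" is to inspect the head of the raw response bytes and only deserialize when an error is signaled. The Python sketch below is a rough illustration under stated assumptions, not code from NLog.Targets.ElasticSearch or NEST: the name `bulk_has_errors` and the `head_size` parameter are hypothetical, and it assumes the server returns compact JSON with `errors` near the start of the body, as in the log line quoted earlier. When that assumption does not hold, it falls back to a full parse.

```python
import json


def bulk_has_errors(body: bytes, head_size: int = 64) -> bool:
    """Cheaply decide whether a raw _bulk response body reports item errors."""
    # Only look at the head so that text inside individual items cannot match.
    head = body[:head_size]
    if b'"errors":false' in head:
        return False  # fast path: nothing failed, no deserialization needed
    if b'"errors":true' in head:
        return True
    # Unexpected layout (for example pretty-printed JSON): parse the whole body.
    return bool(json.loads(body).get("errors"))


success = b'{"took":3,"errors":false,"items":[{"index":{"status":201}}]}'
failure = b'{"took":0,"errors":true,"items":[{"index":{"status":403}}]}'
print(bulk_has_errors(success), bulk_has_errors(failure))
```

With a check of this kind, the full item-by-item deserialization would only be paid for responses that actually contain failures.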

lucianaparaschivei commented 3 years ago

I have run some benchmarks and found no real overhead from using the BulkResponse compared with the BytesResponse. I would not switch to the NEST client, as it will add overhead and in reality it uses the low-level client anyway. I also think it is important to spot errors in the process, as we have seen Elasticsearch logs go missing intermittently without any internal logging. Here are my results for writing one simple log message 100 times (thus forcing many response deserializations):

  1. BulkResponse with error: image

     BytesResponse with error: image

  2. BulkResponse success: image

     BytesResponse success: image

snakefoot commented 3 years ago

Amazing that sending 100 messages costs 2 MB of allocations. It is also strange that BulkResponse does not increase allocations, since it should deserialize the actual response, whereas BytesResponse just contains the response as blob data.

I have created the following issue: https://github.com/elastic/elastic-transport-net/issues/30

lucianaparaschivei commented 3 years ago

FYI: 1000 messages cost 15 MB.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

lucianaparaschivei commented 3 years ago

activate

snakefoot commented 3 years ago

@lucianaparaschivei Would it be possible to extract the last commit, "feat: add support for OAuth tokens", into a new pull request?

lucianaparaschivei commented 3 years ago

Sure, I added it by mistake.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.