Error:
I found this bug while using the client.helpers.bulk helper function. When I sent a batch of records into the function, I got the following error: DeserializationError: Unexpected token u in JSON at position 0. The error occurs when the bulk helper is used on an array of records that contains both INSERT and DELETE operations.
Code:
I was able to reproduce the error with a script (test.js) that uses a basic mock client and a simple batch of records as input. The output of the script is:
$ ./test.js
{
  status: 404,
  error: undefined,
  operation: { index: { _index: 'TEST_INDEX_NAME', _id: '1' } },
  document: { id: '1', key: 'test1' },
  retried: false
}
{
  status: 404,
  error: undefined,
  operation: { id: '1', key: 'test1' },
  document: null,
  retried: false
}
/@opensearch-project/opensearch/lib/Serializer.js:65
      throw new DeserializationError(err.message, json)
      ^
DeserializationError: Unexpected token u in JSON at position 0
    at Serializer.deserialize (/@opensearch-project/opensearch/lib/Serializer.js:65:13)
    at /@opensearch-project/opensearch/lib/Helpers.js:739:34
    at onBody (/@opensearch-project/opensearch/lib/Transport.js:368:9)
    at Class.onEnd (/@opensearch-project/opensearch/lib/Transport.js:285:11)
    ...
  data: undefined
}
Problem:
To make fewer but larger bulk calls and maximize efficiency, I added a new field (operation) to each 'delete' record in my Lambda handler function. This lets me send all records, regardless of their operation, into one client.helpers.bulk call (as can be seen above). However, client.helpers.bulk produces this error for batches of records that contain both types of operations.
Stepping through the code, I was able to trace this error to a nonuniform stride between two arrays in Helpers.js, bulkBody and items. In this line, the client computes an index (indexSlice) into bulkBody to find the entry that corresponds to the current index in items. Since the batch has both index and delete operations, bulkBody is a mix of operation objects AND document objects: each indexed record adds both to the array, but each deleted record adds only its operation. So you can no longer use just the index of a record in items to find its corresponding entry in bulkBody.
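To make the stride mismatch concrete, here is a minimal sketch that does not use the client at all. It assumes the helper derives indexSlice as i * 2 for non-delete operations and i for deletes. That assumption is consistent with the output above, but it is my reading, not a quote from Helpers.js. The function name naiveIndexSlice, the ids '2' and '3', and the [index, delete, index] batch shape are illustrative assumptions.

```javascript
// Simulated bulk body for an assumed batch of [index, delete, index].
// Index operations contribute two entries (action + document),
// delete operations contribute only one (action).
const bulkBody = [
  JSON.stringify({ index: { _index: 'TEST_INDEX_NAME', _id: '1' } }),
  JSON.stringify({ id: '1', key: 'test1' }),
  JSON.stringify({ delete: { _index: 'TEST_INDEX_NAME', _id: '2' } }), // hypothetical id
  JSON.stringify({ index: { _index: 'TEST_INDEX_NAME', _id: '3' } }), // hypothetical id
  JSON.stringify({ id: '3', key: 'test3' })
]

// The bulk response carries exactly one item per operation.
const items = [
  { index: { status: 404 } },
  { delete: { status: 404 } },
  { index: { status: 404 } }
]

// Assumed stride logic (hypothetical name): only correct when the batch
// contains a single kind of operation.
function naiveIndexSlice (operation, i) {
  return operation !== 'delete' ? i * 2 : i
}

const slices = items.map((item, i) => naiveIndexSlice(Object.keys(item)[0], i))

console.log(slices) // [ 0, 1, 4 ]
console.log(bulkBody[slices[1]]) // the document of record '1', not the delete action
console.log(bulkBody[slices[2] + 1]) // undefined: one past the end of the array
```

The second slice lands on the document of the first record, which matches the second object printed above (operation: { id: '1', key: 'test1' }), and the third slice plus one falls off the end of the array.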
In the event of an error, this incorrect indexSlice value causes problems both when grabbing the right document to retry and when building the onDrop output, seen here. In my case, the value read from index indexSlice + 1 of bulkBody is undefined (it is past the end of the array). That undefined value cannot be deserialized here, so the DeserializationError occurs.
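The failure mode itself can be shown in isolation. The sketch below uses a hypothetical stand-in deserialize function (not the library's Serializer class) that wraps JSON.parse the way the stack trace suggests, by rethrowing with the original message and attaching the input as data. Note that the exact message text depends on the Node.js version, so newer runtimes may word it differently from "Unexpected token u in JSON at position 0".

```javascript
// Hypothetical stand-in for a JSON deserializer; it only mimics the
// wrap-and-rethrow pattern visible in the stack trace above.
function deserialize (json) {
  try {
    return JSON.parse(json)
  } catch (err) {
    const wrapped = new Error(err.message)
    wrapped.name = 'DeserializationError'
    wrapped.data = json
    throw wrapped
  }
}

const bulkBody = [
  '{"index":{"_index":"TEST_INDEX_NAME","_id":"1"}}',
  '{"id":"1","key":"test1"}'
]

let caught = null
try {
  // Index 2 is past the end of the array, so this is deserialize(undefined).
  // JSON.parse coerces undefined to the string "undefined", which is not valid JSON.
  deserialize(bulkBody[2])
} catch (err) {
  caught = err
}

console.log(caught.name) // DeserializationError
console.log(caught.data) // undefined
```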
Solution:
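Maybe indexSlice can be calculated correctly by counting how many of the records seen so far were 'deletes'. Each delete adds one entry to bulkBody and every other operation adds two, so the position for item i would be i * 2 minus the number of deletes that precede it.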
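One way this could look, as a rough sketch rather than a patch against Helpers.js: keep a running offset into bulkBody and advance it by one for a delete and by two for anything else. The function name computeIndexSlices and the sample items are hypothetical, and the sketch assumes delete is the only operation that contributes a single entry.

```javascript
// Returns, for each response item, the position of its action entry in bulkBody.
// Assumption: deletes occupy one entry, all other operations occupy two.
function computeIndexSlices (items) {
  const slices = []
  let offset = 0
  for (const item of items) {
    const operation = Object.keys(item)[0]
    slices.push(offset)
    offset += operation === 'delete' ? 1 : 2
  }
  return slices
}

// Sample response items for an assumed batch of [index, delete, index].
const items = [
  { index: { status: 404 } },
  { delete: { status: 404 } },
  { index: { status: 404 } }
]

console.log(computeIndexSlices(items)) // [ 0, 2, 3 ]
```

For a batch with a single kind of operation this gives the same positions as a fixed stride (0, 2, 4, ... for index-only and 0, 1, 2, ... for delete-only), so uniform batches would behave as before.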