Closed r3oath closed 6 years ago
Can you please post the code you're using to generate the call to ->batchDetectSentiment()?
There are a few moving parts, but it basically boils down to:
$scores = $this->getSentimentScores(
    $this->client->batchDetectSentiment([
        'LanguageCode' => $this->language,
        'TextList' => $this->processComments($comments),
    ])
);
$this->language returns 'en'.
$this->processComments($comments) returns an array of strings, e.g.:
["I really love that shirt", "Foo bar is a good company"]
So I can shed a little more light on this issue. I swapped out the batchDetectSentiment call with detectSentiment, passing the same comment data (albeit one comment at a time), and I'm not getting the described error.
So it seems to be happening with batched requests only. Judging by the response status code, and given that I am passing all data as expected/required and the strings are normalised (detectSentiment works correctly with all of them), the issue is more than likely happening deeper in the SDK, or perhaps even with Guzzle.
I was able to successfully call the ->batchDetectSentiment() operation as follows:
$client->batchDetectSentiment([
'LanguageCode' => 'en',
'TextList' => ["I really love that shirt", "Foo bar is a good company"],
]);
Please make sure the structure for your operation has the correct input data. You can also use the 'debug' flag to help in this process.
The input I'm passing is per the API specifications: a list of strings. Here's an example list which fails for me, and the results of running the request with debug enabled.
List of strings:
array(4) {
[5]=>
string(47) "Speed limits have nothing to do with tiredness"
[6]=>
string(61) "False They generally happen because of draconian speed limits"
[7]=>
string(38) "False some drivers start driving tired"
[8]=>
string(26) "They forget to take breaks"
}
Debug results:
-> Entering step init, name 'idempotency_auto_fill'
---------------------------------------------------
command was set to array(3) {
["instance"]=>
string(32) "[CUT]"
["name"]=>
string(20) "BatchDetectSentiment"
["params"]=>
array(3) {
["LanguageCode"]=>
string(2) "en"
["TextList"]=>
array(4) {
[5]=>
string(47) "Speed limits have nothing to do with tiredness"
[6]=>
string(61) "False They generally happen because of draconian speed limits"
[7]=>
string(38) "False some drivers start driving tired"
[8]=>
string(26) "They forget to take breaks"
}
["@http"]=>
array(1) {
["debug"]=>
resource(493) of type (stream)
}
}
}
request was set to array(0) {
}
-> Entering step validate, name 'validation'
--------------------------------------------
no changes
-> Entering step build, name 'builder'
--------------------------------------
request.instance was set to [CUT]
request.method was set to POST
request.headers was set to array(4) {
["X-Amz-Security-Token"]=>
string(7) "[TOKEN]"
["Host"]=>
array(1) {
[0]=>
string(34) "comprehend.eu-west-1.amazonaws.com"
}
["X-Amz-Target"]=>
array(1) {
[0]=>
string(40) "Comprehend_20171127.BatchDetectSentiment"
}
["Content-Type"]=>
array(1) {
[0]=>
string(26) "application/x-amz-json-1.1"
}
}
request.body was set to {"LanguageCode":"en","TextList":{"5":"Speed limits have nothing to do with tiredness","6":"False They generally happen because of draconian speed limits","7":"False some drivers start driving tired","8":"They forget to take breaks"}}
request.scheme was set to https
-> Entering step build, name ''
-------------------------------
request.instance changed from [CUT] to [CUT]
request.headers.User-Agent was set to array(1) {
[0]=>
string(19) "aws-sdk-php/3.52.23"
}
-> Entering step sign, name 'invocation-id'
-------------------------------------------
request.instance changed from [CUT] to [CUT]
request.headers.aws-sdk-invocation-id was set to array(1) {
[0]=>
string(32) "[CUT]"
}
-> Entering step sign, name 'retry'
-----------------------------------
request.instance changed from [CUT] to [CUT]
request.headers.aws-sdk-retry was set to array(1) {
[0]=>
string(3) "0/0"
}
-> Entering step sign, name 'signer'
------------------------------------
request.instance changed from [CUT] to [CUT]
request.headers.X-Amz-Date was set to array(1) {
[0]=>
string(16) "20180315T053842Z"
}
request.headers.Authorization was set to array(1) {
[0]=>
string(247) "AWS4-HMAC-SHA256 Credential=[KEY]/20180315/eu-west-1/comprehend/aws4_request, SignedHeaders=aws-sdk-invocation-id;aws-sdk-retry;host;x-amz-date;x-amz-target, Signature=[SIGNATURE]
}
* Rebuilt URL to: https://comprehend.eu-west-1.amazonaws.com/
* Found bundle for host comprehend.eu-west-1.amazonaws.com: 0x7fb2ebe102a0 [can pipeline]
* Re-using existing connection! (#0) with host comprehend.eu-west-1.amazonaws.com
* Connected to comprehend.eu-west-1.amazonaws.com (54.194.137.78) port 443 (#0)
> POST / HTTP/1.1
Host: comprehend.eu-west-1.amazonaws.com
X-Amz-Target: Comprehend_20171127.BatchDetectSentiment
Content-Type: application/x-amz-json-1.1
aws-sdk-invocation-id: [CUT]
aws-sdk-retry: 0/0
X-Amz-Date: 20180315T053842Z
Authorization: AWS4-HMAC-SHA256 Credential=[KEY]/20180315/eu-west-1/comprehend/aws4_request, SignedHeaders=aws-sdk-invocation-id;aws-sdk-retry;host;x-amz-date;x-amz-target, Signature=[SIGNATURE]
User-Agent: aws-sdk-php/3.52.23 GuzzleHttp/6.2.1 curl/7.54.0 PHP/7.1.13
Content-Length: 234
* upload completely sent off: 234 out of 234 bytes
< HTTP/1.1 400 Bad Request
< Date: Thu, 15 Mar 2018 05:38:42 GMT
< Content-Type: application/x-amz-json-1.1
< Content-Length: 99
< Connection: keep-alive
< x-amzn-RequestId: [CUT]
<
* Connection #0 to host comprehend.eu-west-1.amazonaws.com left intact
<- Leaving step sign, name 'signer'
-----------------------------------
error was set to array(13) {
["instance"]=>
string(32) "[CUT]"
["class"]=>
string(44) "Aws\Comprehend\Exception\ComprehendException"
["message"]=>
string(497) "Error executing "BatchDetectSentiment" on "https://comprehend.eu-west-1.amazonaws.com"; AWS HTTP error: Client error: `POST https://comprehend.eu-west-1.amazonaws.com` resulted in a `400 Bad Request` response:
{"__type":"SerializationException","Message":"Start of structure or map found where not expected."}
SerializationException (client): Start of structure or map found where not expected. - {"__type":"SerializationException","Message":"Start of structure or map found where not expected."}"
["file"]=>
string(109) "[CUT]/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php"
["line"]=>
int(191)
["trace"]=>
string(7414) "#0 [CUT]/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php(100): Aws\WrappedHttpHandler->parseError(Array, Object(GuzzleHttp\Psr7\Request), Object(Aws\Command), Array)
#1 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(203): Aws\WrappedHttpHandler->Aws\{closure}(Array)
#2 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(174): GuzzleHttp\Promise\Promise::callHandler(2, Array, Array)
#3 [CUT]/vendor/guzzlehttp/promises/src/RejectedPromise.php(40): GuzzleHttp\Promise\Promise::GuzzleHttp\Promise\{closure}(Array)
#4 [CUT]/vendor/guzzlehttp/promises/src/TaskQueue.php(47): GuzzleHttp\Promise\RejectedPromise::GuzzleHttp\Promise\{closure}()
#5 [CUT]/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(96): GuzzleHttp\Promise\TaskQueue->run()
#6 [CUT]/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(123): GuzzleHttp\Handler\CurlMultiHandler->tick()
#7 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(246): GuzzleHttp\Handler\CurlMultiHandler->execute(true)
#8 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(223): GuzzleHttp\Promise\Promise->invokeWaitFn()
#9 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(267): GuzzleHttp\Promise\Promise->waitIfPending()
#10 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(225): GuzzleHttp\Promise\Promise->invokeWaitList()
#11 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(267): GuzzleHttp\Promise\Promise->waitIfPending()
#12 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(225): GuzzleHttp\Promise\Promise->invokeWaitList()
#13 [CUT]/vendor/guzzlehttp/promises/src/Promise.php(62): GuzzleHttp\Promise\Promise->waitIfPending()
#14 [CUT]/vendor/aws/aws-sdk-php/src/AwsClientTrait.php(58): GuzzleHttp\Promise\Promise->wait()
#15 [CUT]/vendor/aws/aws-sdk-php/src/AwsClientTrait.php(77): Aws\AwsClient->execute(Object(Aws\Command))
#16 [CUT]/app/Services/Aws.php(53): Aws\AwsClient->__call('batchDetectSent...', Array)
#17 [internal function]: App\Services\Aws->App\Services\{closure}(Array, 1)
#18 [CUT]/vendor/laravel/framework/src/Illuminate/Support/Collection.php(861): array_map(Object(Closure), Array, Array)
#19 [CUT]/app/Services/Aws.php(56): Illuminate\Support\Collection->map(Object(Closure))
#20 [CUT]/app/Console/Commands/CalculateSentimentCommand.php(82): App\Services\Aws->getAggregatedSentimentFor(Object(App\Objects\FacebookPost), Object(App\Console\Commands\CalculateSentimentCommand))
#21 [CUT]/app/Console/Commands/CalculateSentimentCommand.php(49): App\Console\Commands\CalculateSentimentCommand->packageSentimentFor(Object(App\Objects\FacebookPost))
#22 [internal function]: App\Console\Commands\CalculateSentimentCommand->App\Console\Commands\{closure}(Object(App\Objects\FacebookPost), 2)
#23 [CUT]/vendor/laravel/framework/src/Illuminate/Support/Collection.php(861): array_map(Object(Closure), Array, Array)
#24 [CUT]/app/Console/Commands/CalculateSentimentCommand.php(50): Illuminate\Support\Collection->map(Object(Closure))
#25 [internal function]: App\Console\Commands\CalculateSentimentCommand->handle()
#26 [CUT]/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(29): call_user_func_array(Array, Array)
#27 [CUT]/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(87): Illuminate\Container\BoundMethod::Illuminate\Container\{closure}()
#28 [CUT]/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(31): Illuminate\Container\BoundMethod::callBoundMethod(Object(Illuminate\Foundation\Application), Array, Object(Closure))
#29 [CUT]/vendor/laravel/framework/src/Illuminate/Container/Container.php(549): Illuminate\Container\BoundMethod::call(Object(Illuminate\Foundation\Application), Array, Array, NULL)
#30 [CUT]/vendor/laravel/framework/src/Illuminate/Console/Command.php(183): Illuminate\Container\Container->call(Array)
#31 [CUT]/vendor/symfony/console/Command/Command.php(252): Illuminate\Console\Command->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
#32 [CUT]/vendor/laravel/framework/src/Illuminate/Console/Command.php(170): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
#33 [CUT]/vendor/symfony/console/Application.php(946): Illuminate\Console\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#34 [CUT]/vendor/symfony/console/Application.php(248): Symfony\Component\Console\Application->doRunCommand(Object(App\Console\Commands\CalculateSentimentCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#35 [CUT]/vendor/symfony/console/Application.php(148): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#36 [CUT]/vendor/laravel/framework/src/Illuminate/Console/Application.php(88): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#37 [CUT]/vendor/laravel/framework/src/Illuminate/Foundation/Console/Kernel.php(121): Illuminate\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#38 [CUT]/artisan(37): Illuminate\Foundation\Console\Kernel->handle(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#39 {main}"
["type"]=>
string(6) "client"
["code"]=>
string(22) "SerializationException"
["requestId"]=>
string(36) "[CUT]"
["statusCode"]=>
int(400)
["result"]=>
NULL
["request"]=>
array(5) {
["instance"]=>
string(32) "[CUT]"
["method"]=>
string(4) "POST"
["headers"]=>
array(9) {
["X-Amz-Security-Token"]=>
string(7) "[TOKEN]"
["Host"]=>
array(1) {
[0]=>
string(34) "comprehend.eu-west-1.amazonaws.com"
}
["X-Amz-Target"]=>
array(1) {
[0]=>
string(40) "Comprehend_20171127.BatchDetectSentiment"
}
["Content-Type"]=>
array(1) {
[0]=>
string(26) "application/x-amz-json-1.1"
}
["User-Agent"]=>
array(1) {
[0]=>
string(19) "aws-sdk-php/3.52.23"
}
["aws-sdk-invocation-id"]=>
array(1) {
[0]=>
string(32) "[CUT]"
}
["aws-sdk-retry"]=>
array(1) {
[0]=>
string(3) "0/0"
}
["X-Amz-Date"]=>
array(1) {
[0]=>
string(16) "20180315T053842Z"
}
["Authorization"]=>
array(1) {
[0]=>
string(247) "AWS4-HMAC-SHA256 Credential=[KEY]/20180315/eu-west-1/comprehend/aws4_request, SignedHeaders=aws-sdk-invocation-id;aws-sdk-retry;host;x-amz-date;x-amz-target, Signature=[SIGNATURE]
}
}
["body"]=>
string(234) "{"LanguageCode":"en","TextList":{"5":"Speed limits have nothing to do with tiredness","6":"False They generally happen because of draconian speed limits","7":"False some drivers start driving tired","8":"They forget to take breaks"}}"
["scheme"]=>
string(5) "https"
}
["response"]=>
array(4) {
["instance"]=>
string(32) "[CUT]"
["statusCode"]=>
int(400)
["headers"]=>
array(6) {
["X-Amz-Security-Token"]=>
string(7) "[TOKEN]"
["Date"]=>
array(1) {
[0]=>
string(29) "Thu, 15 Mar 2018 05:38:42 GMT"
}
["Content-Type"]=>
array(1) {
[0]=>
string(26) "application/x-amz-json-1.1"
}
["Content-Length"]=>
array(1) {
[0]=>
string(2) "99"
}
["Connection"]=>
array(1) {
[0]=>
string(10) "keep-alive"
}
["x-amzn-RequestId"]=>
array(1) {
[0]=>
string(36) "[CUT]"
}
}
["body"]=>
string(99) "{"__type":"SerializationException","Message":"Start of structure or map found where not expected."}"
}
}
Inclusive step time: 0.3991219997406
<- Leaving step sign, name 'retry'
----------------------------------
no changes
Inclusive step time: 0.39935398101807
<- Leaving step sign, name 'invocation-id'
------------------------------------------
no changes
Inclusive step time: 0.39951109886169
<- Leaving step build, name ''
------------------------------
no changes
Inclusive step time: 0.3996479511261
<- Leaving step build, name 'builder'
-------------------------------------
no changes
Inclusive step time: 0.39979195594788
<- Leaving step validate, name 'validation'
-------------------------------------------
no changes
Inclusive step time: 0.39995503425598
<- Leaving step init, name 'idempotency_auto_fill'
--------------------------------------------------
no changes
Inclusive step time: 0.40009808540344
In WrappedHttpHandler.php line 191:
Error executing "BatchDetectSentiment" on "https://comprehend.eu-west-1.amazonaws.com"; AWS HTTP error: Client error: `POST https://comprehend.eu-west-1.amazonaws.com` res
ulted in a `400 Bad Request` response:
{"__type":"SerializationException","Message":"Start of structure or map found where not expected."}
SerializationException (client): Start of structure or map found where not expected. - {"__type":"SerializationException","Message":"Start of structure or map found where
not expected."}
In RequestException.php line 113:
Client error: `POST https://comprehend.eu-west-1.amazonaws.com` resulted in a `400 Bad Request` response:
{"__type":"SerializationException","Message":"Start of structure or map found where not expected."}
Upon further testing, it seems the issue stems from the fact that each batch detect in my application does its job on an array of chunked comments. Hence, as seen in the example above, some arrays aren't indexed starting from zero, and the request body ends up with TextList serialised as a JSON object (a map) rather than a JSON array.
Perhaps this detail should be noted in the documentation, or array_values called on the SDK side to ensure the indexing matches what the API is expecting.
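For anyone hitting the same error, here is a minimal sketch of the effect, using plain json_encode rather than the SDK itself. It assumes the SDK's request body follows the same rule as json_encode, which is consistent with the request.body shown in the debug output above. The variable names are illustrative only.

```php
<?php
// Keys 5 and 6, as left behind by a key-preserving chunk or filter
// (the same kind of keys seen in the failing request above).
$texts = [
    5 => 'Speed limits have nothing to do with tiredness',
    6 => 'They forget to take breaks',
];

// Keys are not 0..n-1, so json_encode emits a JSON object (a map).
$asMap = json_encode(['TextList' => $texts]);
echo $asMap, PHP_EOL;
// {"TextList":{"5":"Speed limits have nothing to do with tiredness","6":"They forget to take breaks"}}

// array_values() re-indexes from zero, so json_encode emits a JSON array.
$asList = json_encode(['TextList' => array_values($texts)]);
echo $asList, PHP_EOL;
// {"TextList":["Speed limits have nothing to do with tiredness","They forget to take breaks"]}
```

Passing array_values($texts) as 'TextList' is the workaround that the later replies in this thread confirm.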
The use of array_values should be left up to the implementer in these cases. If we have this by default in the SDK, it means we'll be changing the structure of the data that you generated, which may lead to other, more hidden, issues down the line.
The documentation also shows that this is an unindexed array.
$result = $client->batchDetectSentiment([
// Key Value
'LanguageCode' => '<string>', // REQUIRED
'TextList' => ['<string>', ...], // No keys
]);
// Shows sub-elements with indexes are represented differently.
...
[
// Key Value
'Index' => <integer>,
'Sentiment' => 'POSITIVE|NEGATIVE|NEUTRAL|MIXED',
'SentimentScore' => [
// Key Value
'Mixed' => <float>,
'Negative' => <float>,
'Neutral' => <float>,
'Positive' => <float>,
],
],
...
Every array declared in PHP is associative under the hood, e.g. ["foo", "bar"] = [0 => "foo", 1 => "bar"].
So if the API intrinsically requires sequential, zero-based arrays, then either the documentation needs to make that apparent, the error messaging needs to improve, or the SDK should call array_values. If the order of strings is critical, then even an exception or warning from the SDK that the array passed isn't sequential and zero-based would go a long way towards helping your end users debug bizarre situations like the one I faced.
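As a rough sketch of the kind of check I mean: the helper below is hypothetical and not part of the AWS SDK. It rejects any array whose keys are not exactly 0..n-1 in order, and leaves the data untouched otherwise.

```php
<?php
/**
 * Hypothetical guard (not an SDK function): returns $items unchanged if it is
 * a sequential, zero-based list, otherwise throws with a hint about the fix.
 */
function assertZeroBasedList(array $items, string $name): array
{
    // array_values() re-indexes from zero; strict comparison also compares
    // keys and order, so this only passes when the keys are already 0..n-1.
    if (array_values($items) !== $items) {
        throw new InvalidArgumentException(sprintf(
            '%s must be a sequential, zero-based array; got keys [%s]. Consider array_values().',
            $name,
            implode(', ', array_keys($items))
        ));
    }

    return $items;
}
```

Calling assertZeroBasedList($texts, 'TextList') before building the request would turn the opaque SerializationException into an immediate, local error without silently reordering or re-keying anything.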
At the end of the day, the fact that you can pass an array with a non-zero index and cause a SerializationException on the AWS side suggests this issue deserves a little attention.
You saved my day! I've been battling this problem for days. Applying array_values fixed the problem. Thanks a lot!
I'm trying to make use of the BatchDetectSentiment API, however I am regularly getting errors when making requests (some requests go through fine). There seems to be no logical correlation between what I am sending and when the error occurs. In the BatchDetectSentiment example, I am simply sending plain-text sentences. I've even gone as far as removing all non-alphanumeric (a-zA-Z0-9) characters to rule out the possibility that it isn't handling emojis (the sentences being processed come from the Facebook API). I'm also sending no more than 5 documents per batch request, and limiting each document to a maximum of 5000 bytes.
Any ideas? See the exception message below: