Closed: holly-cummins closed this issue 5 years ago
@holly-cummins Thanks for opening the issue. Generally, a 400 status code means that the server received a malformed request it couldn't parse. My suspicion is that there's a race condition somewhere causing some of the requests to become corrupted by the concurrency mechanism being used.
Can you possibly fill us in on some details, such as:
- the operating system you're running on
- the version of the SDK you're using
- the code that's making the requests
Thanks.
Thanks!
We're seeing the issue on both Mac OS and Linux (both locally and from IBM Cloud). We're using v3.4.5 of the SDK, but we saw similar issues with earlier versions.
We did have a Linux-only race condition in our code around file access, which our unit tests surfaced, so we're now pretty careful to wait for the right stream events before doing anything:
const params = {
  images_file: fs.createReadStream(fileToUpload),
  classifier_ids: [classifier_id],
  threshold: MINIMUM_CONFIDENCE
};

// Wait for the stream to be available before calling the service
return new Promise((resolve, reject) => {
  params.images_file.on('open', () => {
    this.classify(params)
      .then(result => {
        // do stuff with the result ...
        resolve(result);
      })
      .catch(reject);
  });
});
// ...
classify(params) {
  return promiseRetry(
    { retries: MAX_RETRY_TIMES_FOR_TANK, factor: 1.4, minTimeout: 200 },
    (retry, number) => {
      return this.classifyWithoutRetry(params).catch(err => {
        console.log(
          'Error returned from Watson. Retrying (attempt',
          number + '): ',
          err
        );
        retry(err);
        // TODO we seem to get this persistently, after the second 52 image.
        // Setting the concurrency high makes it happen on the first image,
        // but setting the concurrency low doesn't make it not happen.
      });
    }
  ).catch(err => {
    console.error(
      'Giving up with persistent error: ',
      err,
      '. Counts for this tank will not be accurate.'
    );
    return { images: [] };
  });
}
classifyWithoutRetry(params) {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();
    this.watson.classify(params, (err, result) => {
      // console.log('Watson call completed in time ', Date.now() - startTime);
      if (err) {
        reject(err);
      } else {
        resolve(result);
      }
    });
  });
}
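For context, the concurrency mentioned in that TODO comes from firing many classify calls at once. A throttle can cap how many are in flight; here's a minimal sketch of that pattern using p-limit (an illustration, not our actual code; MAX_CONCURRENT, images, and paramsFor are placeholder names):

const pLimit = require('p-limit');

// Allow at most MAX_CONCURRENT classify calls in flight at once.
const limit = pLimit(MAX_CONCURRENT);

// paramsFor is a placeholder for whatever builds the request params
// (including a fresh read stream) for each image.
return Promise.all(
  images.map(image => limit(() => this.classify(paramsFor(image))))
);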
@holly-cummins I suspect this is a problem in the service. Can you try removing the cookie jar from base_service? You will have to extend that object, or just modify the one in node_modules.
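A minimal sketch of the extension approach (this assumes the jar is stored on this._options.jar, which may differ between SDK versions, so check your copy of base_service first):

const VisualRecognitionV3 = require('watson-developer-cloud/visual-recognition/v3');

class VisualRecognitionNoCookies extends VisualRecognitionV3 {
  constructor(options) {
    super(options);
    // Drop the shared cookie jar so requests don't carry state between calls.
    // (Assumption: base_service keeps the jar at this._options.jar.)
    delete this._options.jar;
  }
}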
@holly-cummins were you able to try this solution? Did it work for you?
I believe I'm having a similar issue. I had an app that called two Visual Recognition services, the main classify one and the face detection one. It started acting up oddly: one call would work quickly and the other would hang. If I ran the test again, which service worked was completely random. You can see a simplified version here, where I've removed the use of Promises and such:
https://gist.github.com/cfjedimaster/6c6cbb39608b35623f9d4e83deedf2df
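Roughly, the shape of the repro is this (a sketch, not the gist verbatim; the version date, credentials, and file names are placeholders):

const fs = require('fs');
const VisualRecognitionV3 = require('watson-developer-cloud/visual-recognition/v3');

const visualRecognition = new VisualRecognitionV3({
  version: '2018-03-19',
  iam_apikey: process.env.VR_APIKEY
});

// Fire both calls at once with plain callbacks and see which one returns.
visualRecognition.classify(
  { images_file: fs.createReadStream('./test1.jpg') },
  (err, result) => console.log('classify:', err || JSON.stringify(result))
);

visualRecognition.detectFaces(
  { images_file: fs.createReadStream('./test2.jpg') },
  (err, result) => console.log('detectFaces:', err || JSON.stringify(result))
);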
As a follow-up: around 5 minutes later, I do get a result from my code:
{ Error: <HTML><HEAD><TITLE>Error</TITLE></HEAD><BODY>
An error occurred while processing your request.<p>
Reference #27.4cbbce17.1541105610.407f434d
</BODY></HTML>
at Request._callback (C:\projects\recogtester\node_modules\watson-developer-cloud\lib\requestwrapper.js:113:21)
at Request.self.callback (C:\projects\recogtester\node_modules\watson-developer-cloud\node_modules\request\request.js:185:22)
at Request.emit (events.js:182:13)
at Request.<anonymous> (C:\projects\recogtester\node_modules\watson-developer-cloud\node_modules\request\request.js:1157:10)
at Request.emit (events.js:182:13)
at IncomingMessage.<anonymous> (C:\projects\recogtester\node_modules\watson-developer-cloud\node_modules\request\request.js:1079:12)
at Object.onceWrapper (events.js:273:13)
at IncomingMessage.emit (events.js:187:15)
at endReadableNT (_stream_readable.js:1092:12)
at process._tickCallback (internal/process/next_tick.js:63:19) code: 0, 'x-global-transaction-id': undefined } null
OK, I'm sorry, it's an issue with the read streams. For some reason, if I make a new params block and create a new read stream for the second image, it works perfectly. So maybe something changed in Node since I built my initial demo.
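That fits with streams being single-use: once a read stream has been consumed by one request, it can't supply a body again. Building a fresh params object (and stream) per call avoids it; something like this sketch (the helper and file argument are placeholders):

const fs = require('fs');

// Create a fresh read stream for every request; a stream that has
// already been read will produce an empty body the second time.
function classifyFile(visualRecognition, file) {
  return new Promise((resolve, reject) => {
    visualRecognition.classify(
      { images_file: fs.createReadStream(file) },
      (err, result) => (err ? reject(err) : resolve(result))
    );
  });
}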
Now that we don't have cookies, I think this issue can be marked as resolved.
Closing in light of the above comment and due to inactivity. If this problem persists, we can reopen.
I'm happy to be optimistic that this is fixed and close it, and I can re-open if I see the issue when I next stage the affected app.
We've switched from sending zipped files to the VR service to sending individual images, to work around a performance issue on the server side. This generally works well, but we're seeing a high rate of errors if we send multiple requests concurrently: around one in 20 images comes back with a code 400 error. On retry, the image succeeds as we'd expect.
Error: