Open metasj opened 4 years ago
These files showed up in assets.priorarchive.org but were not indexed. Joel is currently looking into it.
It may be worth noting here which FTP server was used; that way we can be sure we're looking at the correct set of logs (I believe only one is still active, but still).
@slifty there are two Elastic Beanstalk applications, one for v1 (called TikaServer) and one for v2 (called FileParser). I think the specific failures SJ is talking about are in v1.
The logs there show a lot of errors (including process out of memory), but none from 2019-10-25, when the 20 failures happened.
Not sure if this is relevant; there are three errors in the CloudWatch entries for priorart-v2-prod-handle-new-sftp-file from 10/25 which look like this:
2019-10-25T13:57:42.297Z 0f6ab8ad-12c8-416b-995c-ab35c912f3ae
{
"errorMessage": null,
"errorType": "NotFound",
"stackTrace": [
"Request.extractError (/var/runtime/node_modules/aws-sdk/lib/services/s3.js:565:35)",
"Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)",
"Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)",
"Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:683:14)",
"Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)",
"AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)",
"/var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10",
"Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)",
"Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:685:12)",
"Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:116:18)",
"Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)",
"Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:683:14)",
"Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)",
"AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)",
"/var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10",
"Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)",
"Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:685:12)",
"Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:116:18)",
"callNextListener (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:96:12)",
"IncomingMessage.onEnd (/var/runtime/node_modules/aws-sdk/lib/event_listeners.js:307:13)",
"emitNone (events.js:111:20)",
"IncomingMessage.emit (events.js:208:7)"
]
}
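A note on reading the entry above: the errorMessage is null and nothing in the log says which file was involved, which makes it hard to match these three errors to the 20 failed uploads. A null message with errorType NotFound is consistent with, for example, an SDK v2 headObject call on a key that does not exist, but that is a guess; the Lambda's code is not shown in this thread. A minimal sketch of a wrapper that would put the bucket and key into the log, assuming the handler uses the aws-sdk v2 promise style (headObjectWithContext is a hypothetical name, not something from the project):

```javascript
// Hypothetical helper, not taken from the project's Lambda code.
// `s3` is any object exposing the AWS SDK v2 style headObject(params).promise().
async function headObjectWithContext(s3, params) {
  try {
    return await s3.headObject(params).promise();
  } catch (err) {
    if (err.code === 'NotFound' || err.statusCode === 404) {
      // Re-throw with the bucket and key so the log entry names the file.
      const wrapped = new Error(
        `S3 object not found: s3://${params.Bucket}/${params.Key}`
      );
      wrapped.code = 'NotFound';
      wrapped.originalError = err;
      throw wrapped;
    }
    // Anything other than a missing object is passed through unchanged.
    throw err;
  }
}
```

With something like this in place, a CloudWatch entry for a missing object would carry the key in errorMessage instead of null.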
@slifty I just posted a new file; do you see any new entries?
Here's an error posted from /var/log/containers/server-68c637a59428-stdouterr.log
(@slifty - on which server?)
Joel wrote about this:
This is the file-parser getting rejected from connecting to the IPFS node we’re running... a persistent problem / I don’t think that’s why the uploads were failing - it’s something that sometimes prevents file-parser from starting up when you reboot it
priorart-file-parser@0.1.0 start /usr/src/tika-server
node index.js
environment: production true
Thu, 07 Nov 2019 20:54:14 GMT sequelize deprecated String based operators are now deprecated. Please use Symbol based operators for better security, read more at http://docs.sequelizejs.com/manual/tutorial/querying.html#operators at node_modules/sequelize/lib/sequelize.js:242:13
Listening on port 8080
Failed to connect to IPFS node SyntaxError: Unexpected token < in JSON at position 0
at JSON.parse (<anonymous>)
at streamToValue (/usr/src/tika-server/node_modules/ipfs-http-client/src/utils/stream-to-json-value.js:25:18)
at concat (/usr/src/tika-server/node_modules/ipfs-http-client/src/utils/stream-to-value.js:12:22)
at ConcatStream.<anonymous> (/usr/src/tika-server/node_modules/concat-stream/index.js:37:43)
at emitNone (events.js:111:20)
at ConcatStream.emit (events.js:208:7)
at finishMaybe (/usr/src/tika-server/node_modules/readable-stream/lib/_stream_writable.js:620:14)
at afterWrite (/usr/src/tika-server/node_modules/readable-stream/lib/_stream_writable.js:466:3)
at _combinedTickCallback (internal/process/next_tick.js:145:20)
at process._tickDomainCallback (internal/process/next_tick.js:219:9)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! priorart-file-parser@0.1.0 start: node index.js
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the priorart-file-parser@0.1.0 start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm ERR! A complete log of this run can be found in:
npm ERR! /root/.npm/_logs/2019-11-07T20_54_14_659Z-debug.log
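For context on the log above: "Unexpected token < in JSON" usually means the client received something starting with "<" (often an HTML error page) where it expected JSON, and here the failed connection takes the whole process down with exit status 1. If this really is a transient startup problem, as Joel describes, one option is to retry the connection a few times before giving up. A rough sketch, assuming the connection step can be wrapped in a function that returns a promise; connectWithRetry and its connect argument are hypothetical and stand in for whatever file-parser actually calls:

```javascript
// Hypothetical startup helper; `connect` stands in for whatever call
// file-parser makes to reach the IPFS node (not shown in this thread).
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function connectWithRetry(connect, { attempts = 5, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      console.error(`IPFS connect attempt ${attempt}/${attempts} failed: ${err.message}`);
      if (attempt < attempts) {
        // Exponential backoff: baseDelayMs, then 2x, 4x, ...
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // All attempts failed; surface the last error to the caller.
  throw lastError;
}
```

This only papers over a flaky startup; it would not explain or fix the failed uploads, which Joel thinks have a different cause.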
This is a follow-up to #25, in other contexts: sometimes files successfully sent via FTP (and still on the FTP server) don't appear in search.
Sujith notes he saw errors in the GCP indexing service recently, and is forwarding them. We need to see where ingestion failed, and to retest the process.
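As a starting point for finding where ingestion failed, it may help to get the exact list of files that are on the FTP server but missing from search. A minimal sketch, assuming both lists can be exported as plain arrays of file names; how to obtain them is not covered here, and findUnindexedFiles and the file names below are made up for illustration:

```javascript
// Hypothetical helper: both inputs are plain arrays of file names gathered by
// whatever means are available (an FTP listing, a search-index export).
function findUnindexedFiles(uploadedNames, indexedNames) {
  const indexed = new Set(indexedNames);
  // Keep only the uploaded names that the index does not contain.
  return uploadedNames.filter((name) => !indexed.has(name));
}

// Illustrative data only; these are not real files from the archive.
const uploaded = ['a.pdf', 'b.pdf', 'c.pdf'];
const indexedInSearch = ['a.pdf', 'c.pdf'];
console.log(findUnindexedFiles(uploaded, indexedInSearch));
```

The resulting names could then be searched for in the v1 and v2 logs to see at which stage each one dropped out.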