Hi! Thanks for this project! I like the de-coupled architecture and that you do not modify the original document!
I was trying it out with some basic documents and hit a few snags.
When I drag any new document into the watch folder, the web app crashes:
webapp_1 | GET /api/v1/healthcheck 200 0.300 ms - 25
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Processing create event: \"/data/another test document.pdf\": CREATE" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Publishing event.." type=fs
thumbnail_processor_1 | time="2021-04-18T21:38:51Z" level=info msg="[x] {\"Records\":[{\"eventVersion\":\"2.0\",\"eventSource\":\"lodestone:publisher:fs\",\"awsRegion\":\"\",\"eventTime\":\"2021-04-18T21:38:51.056401437Z\",\"eventName\":\"s3:ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"lodestone\"},\"requestParameters\":{\"sourceIPAddress\":\"172.20.0.6\"},\"responseElements\":{},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"Config\",\"bucket\":{\"name\":\"documents\",\"ownerIdentity\":{\"principalId\":\"lodestone\"},\"arn\":\"arn:aws:s3:::documents\"},\"object\":{\"key\":\"another test document.pdf\",\"size\":713801,\"urlDecodedKey\":\"\",\"versionId\":\"1\",\"eTag\":\"f5cc9d23214de3dfeee766519ad54614\",\"sequencer\":\"\"}}}]}" type=thumbnail
document_processor_1 | time="2021-04-18T21:38:51Z" level=info msg="[x] {\"Records\":[{\"eventVersion\":\"2.0\",\"eventSource\":\"lodestone:publisher:fs\",\"awsRegion\":\"\",\"eventTime\":\"2021-04-18T21:38:51.056401437Z\",\"eventName\":\"s3:ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"lodestone\"},\"requestParameters\":{\"sourceIPAddress\":\"172.20.0.6\"},\"responseElements\":{},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"Config\",\"bucket\":{\"name\":\"documents\",\"ownerIdentity\":{\"principalId\":\"lodestone\"},\"arn\":\"arn:aws:s3:::documents\"},\"object\":{\"key\":\"another test document.pdf\",\"size\":713801,\"urlDecodedKey\":\"\",\"versionId\":\"1\",\"eTag\":\"f5cc9d23214de3dfeee766519ad54614\",\"sequencer\":\"\"}}}]}" type=document
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Publish confirmed!" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": CHMOD" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": WRITE" type=fs
webapp_1 | GET /api/v1/storage/documents/another%20test%20document.pdf 200 57.649 ms - 713801
thumbnail_processor_1 | time="2021-04-18T21:38:51Z" level=info msg="reading file: /tmp/thumb346362703/another test document.pdf" type=thumbnail
webapp_1 | GET /api/v1/storage/documents/another%20test%20document.pdf 200 40.601 ms - 713801
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": CHMOD" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": WRITE" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Processing create event: \"/data/another test document.pdf\": CREATE" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Publishing event.." type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Publish confirmed!" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": CHMOD" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": WRITE" type=fs
tika_1 | INFO tika (autodetecting type)
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": CHMOD" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": WRITE" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Processing create event: \"/data/another test document.pdf\": CREATE" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Publishing event.." type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Publish confirmed!" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": CHMOD" type=fs
fs_publisher_1 | time="2021-04-18T21:38:51Z" level=info msg="Ignoring event: \"/data/another test document.pdf\": WRITE" type=fs
tika_1 | WARN Using fallback font ArialMT for Symbol
webapp_1 | POST /api/v1/storage/thumbnails/another%20test%20document.pdf.jpg 200 17.395 ms - 1643
thumbnail_processor_1 | time="2021-04-18T21:38:52Z" level=info msg="[x] {\"Records\":[{\"eventVersion\":\"2.0\",\"eventSource\":\"lodestone:publisher:fs\",\"awsRegion\":\"\",\"eventTime\":\"2021-04-18T21:38:51.251961211Z\",\"eventName\":\"s3:ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"lodestone\"},\"requestParameters\":{\"sourceIPAddress\":\"172.20.0.6\"},\"responseElements\":{},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"Config\",\"bucket\":{\"name\":\"documents\",\"ownerIdentity\":{\"principalId\":\"lodestone\"},\"arn\":\"arn:aws:s3:::documents\"},\"object\":{\"key\":\"another test document.pdf\",\"size\":713801,\"urlDecodedKey\":\"\",\"versionId\":\"1\",\"eTag\":\"f5cc9d23214de3dfeee766519ad54614\",\"sequencer\":\"\"}}}]}" type=thumbnail
webapp_1 | GET /api/v1/storage/documents/another%20test%20document.pdf 200 13.284 ms - 713801
thumbnail_processor_1 | time="2021-04-18T21:38:52Z" level=info msg="reading file: /tmp/thumb586604642/another test document.pdf" type=thumbnail
webapp_1 | /lodestone/node_modules/aws-sdk/lib/request.js:31
webapp_1 | throw err;
webapp_1 | ^
webapp_1 |
webapp_1 | Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client
webapp_1 | at ServerResponse.setHeader (_http_outgoing.js:530:11)
webapp_1 | at ServerResponse.header (/lodestone/node_modules/express/lib/response.js:767:10)
webapp_1 | at ServerResponse.send (/lodestone/node_modules/express/lib/response.js:170:12)
webapp_1 | at ManagedUpload.callback (/lodestone/routes/storage.js:47:17)
webapp_1 | at Response.finishSinglePart (/lodestone/node_modules/aws-sdk/lib/s3/managed_upload.js:674:28)
webapp_1 | at Request.<anonymous> (/lodestone/node_modules/aws-sdk/lib/request.js:364:18)
webapp_1 | at Request.callListeners (/lodestone/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
webapp_1 | at Request.emit (/lodestone/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
webapp_1 | at Request.emit (/lodestone/node_modules/aws-sdk/lib/request.js:683:14)
webapp_1 | at Request.transition (/lodestone/node_modules/aws-sdk/lib/request.js:22:10)
webapp_1 | at AcceptorStateMachine.runTo (/lodestone/node_modules/aws-sdk/lib/state_machine.js:14:12)
webapp_1 | at /lodestone/node_modules/aws-sdk/lib/state_machine.js:26:10
webapp_1 | at Request.<anonymous> (/lodestone/node_modules/aws-sdk/lib/request.js:38:9)
webapp_1 | at Request.<anonymous> (/lodestone/node_modules/aws-sdk/lib/request.js:685:12)
webapp_1 | at Request.callListeners (/lodestone/node_modules/aws-sdk/lib/sequential_executor.js:116:18)
webapp_1 | at Request.emit (/lodestone/node_modules/aws-sdk/lib/sequential_executor.js:78:10) {
webapp_1 | code: 'ERR_HTTP_HEADERS_SENT',
webapp_1 | time: 2021-04-18T21:38:52.353Z
webapp_1 | }
webapp_1 | npm ERR! code ELIFECYCLE
webapp_1 | npm ERR! errno 1
webapp_1 | npm ERR! lodestone-backend@0.0.0 start: `node ./bin/www`
webapp_1 | npm ERR! Exit status 1
webapp_1 | npm ERR!
webapp_1 | npm ERR! Failed at the lodestone-backend@0.0.0 start script.
webapp_1 | npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
webapp_1 |
webapp_1 | npm ERR! A complete log of this run can be found in:
webapp_1 | npm ERR! /root/.npm/_logs/2021-04-18T21_38_52_398Z-debug.log
loadstone_webapp_1 exited with code 1
thumbnail_processor_1 | time="2021-04-18T21:38:53Z" level=info msg="Error when processing document: Post \"http://webapp:3000/api/v1/storage/thumbnails/another%20test%20document.pdf.jpg\": dial tcp: lookup webapp on 127.0.0.11:53: no such host" type=thumbnail
thumbnail_processor_1 | time="2021-04-18T21:38:53Z" level=info msg="[x] {\"Records\":[{\"eventVersion\":\"2.0\",\"eventSource\":\"lodestone:publisher:fs\",\"awsRegion\":\"\",\"eventTime\":\"2021-04-18T21:38:51.40409119Z\",\"eventName\":\"s3:ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"lodestone\"},\"requestParameters\":{\"sourceIPAddress\":\"172.20.0.6\"},\"responseElements\":{},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"Config\",\"bucket\":{\"name\":\"documents\",\"ownerIdentity\":{\"principalId\":\"lodestone\"},\"arn\":\"arn:aws:s3:::documents\"},\"object\":{\"key\":\"another test document.pdf\",\"size\":713801,\"urlDecodedKey\":\"\",\"versionId\":\"1\",\"eTag\":\"f5cc9d23214de3dfeee766519ad54614\",\"sequencer\":\"\"}}}]}" type=thumbnail
thumbnail_processor_1 | time="2021-04-18T21:38:53Z" level=info msg="Error when processing document: Get \"http://webapp:3000/api/v1/storage/documents/another%20test%20document.pdf\": dial tcp: lookup webapp on 127.0.0.11:53: no such host" type=thumbnail
If I restart the webapp container and then re-run the sync tool, there's a decent chance that the document gets picked up and parsed without a crash. Other times, a 'sync' operation triggers the same failure:
Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client
Please let me know if there's any additional debugging you'd like me to do / additional stack traces you'd like.
EDIT: Forgot to mention that this is with the images referenced in your docker-compose.yaml file as of 2021-04-18, running on Docker for Mac.
I believe this is somewhat related to #131, since creating the data/storage/thumbnails/ folder fixes the issue. It feels like a workaround, though.
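For reference, here is a minimal sketch of the kind of guard that usually avoids ERR_HTTP_HEADERS_SENT in an Express handler. I have not looked at what routes/storage.js actually does at line 47, so this is only an assumption about the cause: a callback calling res.send() after a response has already gone out. The sendOnce helper and the fake response object are hypothetical names made up for illustration; res.headersSent is the standard property on Node/Express responses.

```javascript
// Hypothetical helper: respond only if nothing has been sent yet.
// `res.headersSent` is the standard flag on Node/Express response objects.
function sendOnce(res, status, body) {
  if (res.headersSent) {
    // A response already went out; sending again would throw
    // ERR_HTTP_HEADERS_SENT and, if uncaught, take the process down.
    return false;
  }
  res.status(status).send(body);
  return true;
}

// Minimal stand-in for an Express response, for demonstration only.
function makeFakeResponse() {
  return {
    headersSent: false,
    statusCode: null,
    body: null,
    status(code) {
      this.statusCode = code;
      return this;
    },
    send(body) {
      if (this.headersSent) {
        const err = new Error("Cannot set headers after they are sent to the client");
        err.code = "ERR_HTTP_HEADERS_SENT";
        throw err;
      }
      this.headersSent = true;
      this.body = body;
      return this;
    },
  };
}

// Simulate an upload callback firing after the handler has already responded.
const res = makeFakeResponse();
const first = sendOnce(res, 200, { ok: true });
const second = sendOnce(res, 500, { error: "upload failed" });
console.log(first, second, res.statusCode); // prints: true false 200
```

Without the guard, the second send would throw, which matches the shape of the crash in the trace above. Whether that is really what happens here would need checking against the actual handler.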