Closed: sharkinsspatial closed this 5 years ago
@sharkinsspatial, thanks for the progress. I'm just getting one odd problem.
I'm testing with 1 million tile indices. The upload script (yarn sqs-push) runs without issue and reports that it uploaded all 1 million tiles. However, the SQS page in the AWS console shows that a small, consistent number of messages are missing: after repeating the upload a couple of times, I only ever see 996,500 available messages.
I didn't deploy the entire CFN pipeline; I just made an SQS queue and tested this specific functionality. I'm also not experienced with the finer controls on SQS queues, so you might want to check whether the queue config is off and could have led to that error. I left the queue on AWS (under the name chip-n-scale-test) if you want to have a look. I can also send you the text file of tile indices if you want to use it.
@wronk I added a commit which should fix this. It was a bug on my part: the writable stream was correctly exerting backpressure, but I was failing to capture the streamed records the first time PROMISE_THRESHOLD was exceeded. I also added better error recording around sendMessageBatch, since it always resolves with a 200 even if individual messages fail to insert. Those individual message failures should now be logged whenever they occur.
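For readers unfamiliar with the partial-failure behaviour described above, here is a minimal sketch of how per-message failures can be pulled out of a SendMessageBatch response. It is not the code from the commit. `findFailedEntries` is a hypothetical helper, and the tile index strings are made-up sample data. The only thing taken from the SQS API is the shape of the response, which lists rejected entries under `Failed` with `Id`, `Code`, `SenderFault`, and `Message` fields.

```javascript
// Hypothetical helper (not from the project): match each entry in the
// response's Failed list back to the batch entry that was submitted.
function findFailedEntries(entries, response) {
  const failed = response.Failed || [];
  const byId = new Map(entries.map((entry) => [entry.Id, entry]));
  return failed.map((failure) => ({
    id: failure.Id,
    code: failure.Code,
    senderFault: failure.SenderFault,
    reason: failure.Message,
    // Keep the original body so the message can be logged or retried.
    body: byId.has(failure.Id) ? byId.get(failure.Id).MessageBody : undefined,
  }));
}

// Made-up sample batch and a response shaped like the SQS API output.
const entries = [
  { Id: '0', MessageBody: '10-20-30' },
  { Id: '1', MessageBody: '10-20-31' },
];
const response = {
  Successful: [{ Id: '0', MessageId: 'abc' }],
  Failed: [
    { Id: '1', SenderFault: false, Code: 'InternalError', Message: 'try again' },
  ],
};

const failures = findFailedEntries(entries, response);
failures.forEach((f) => {
  console.error(`SQS batch entry ${f.id} (${f.body}) failed: ${f.code}`);
});
```

Because the request as a whole still succeeds, a check like this has to run on every response and cannot rely on a rejected promise.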
@sharkinsspatial, LGTM, confirmed that I can upload and see all the messages in the queue
@wronk If possible, can you run a test on this branch to verify that it handles the memory growth issue you were experiencing? You can use PROMISE_THRESHOLD to specify the maximum number of in-flight SQS messages. You'll have to balance the number of parallel messages against the memory and IO restrictions here to find the optimal number. The default is 500.
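To illustrate the idea of capping in-flight sends, here is a rough sketch under stated assumptions. It is not the branch's implementation, which uses a writable stream. `InflightTracker`, `pushAll`, and `fakeSend` are hypothetical names, and the 500 passed in simply mirrors the default mentioned above. Note that the record that reaches the threshold is still sent before the loop pauses, so no record is dropped at that boundary.

```javascript
// Hypothetical sketch: track pending send promises and report when the
// caller should pause because the threshold has been reached.
class InflightTracker {
  constructor(threshold) {
    this.threshold = threshold;
    this.pending = new Set();
    this.failed = 0;
  }

  // Register a send promise; returns true when the caller should pause.
  track(promise) {
    this.pending.add(promise);
    promise.then(
      () => { this.pending.delete(promise); },
      () => { this.pending.delete(promise); this.failed += 1; },
    );
    return this.pending.size >= this.threshold;
  }

  // Resolves once every currently pending send has settled.
  drain() {
    return Promise.allSettled([...this.pending]);
  }
}

// Send every record, never holding more than `threshold` pending promises.
async function pushAll(records, send, threshold) {
  const tracker = new InflightTracker(threshold);
  for (const record of records) {
    // The current record is sent first, then we decide whether to pause.
    if (tracker.track(send(record))) {
      await tracker.drain();
    }
  }
  await tracker.drain();
  return tracker.failed; // number of sends that rejected
}

// Stand-in for a real SQS call: resolves after a short delay.
const fakeSend = (record) =>
  new Promise((resolve) => setTimeout(resolve, 1, record));
const records = Array.from({ length: 1200 }, (_, i) => `tile-${i}`);

pushAll(records, fakeSend, 500).then((failed) => {
  console.log(`finished with ${failed} failed sends`);
});
```

A lower threshold keeps fewer promises (and their payloads) in memory at the cost of throughput, which is the trade-off to tune for.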