jncc / datahub

The JNCC datahub - our online web repository of open data and publications.

Queue for batch search updates #53

completer closed 5 years ago

completer commented 5 years ago

We need a queue to handle the website republish.

completer commented 5 years ago

It looks like we will need to use S3 to store the PDF data.

completer commented 5 years ago

https://github.com/raol/amazon-sqs-net-extended-client-lib
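
A rough sketch of the pattern this kind of extended client follows: a large payload goes to object storage and the queue message carries only a pointer to it. This is not the linked library's API. `InMemoryStore`, `build_message`, `read_message`, and the `MAX_INLINE_BYTES` threshold are hypothetical names and values chosen for illustration, and the in-memory store stands in for an S3 bucket so the example runs without AWS.

```python
import base64
import json
import uuid

# Assumed threshold for this sketch; a real queue's limit applies to the
# encoded message, so the usable payload size would be somewhat smaller.
MAX_INLINE_BYTES = 256 * 1024


class InMemoryStore:
    """Hypothetical stand-in for an S3 bucket: maps keys to raw bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


def build_message(body, store, max_inline=MAX_INLINE_BYTES):
    """Return a JSON queue message, offloading large bodies to the store."""
    if len(body) <= max_inline:
        # Small payloads travel inside the message itself.
        encoded = base64.b64encode(body).decode("ascii")
        return json.dumps({"inline": True, "body": encoded})
    # Large payloads are stored separately; the message holds only the key.
    key = f"payloads/{uuid.uuid4()}"
    store.put(key, body)
    return json.dumps({"inline": False, "key": key, "size": len(body)})


def read_message(message, store):
    """Resolve a queue message back to the original payload bytes."""
    envelope = json.loads(message)
    if envelope["inline"]:
        return base64.b64decode(envelope["body"])
    return store.get(envelope["key"])
```

With this shape the consumer does not need to know in advance whether a document was small or large; it calls `read_message` either way and gets the original bytes back.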

mattdebont commented 5 years ago

The Lambda ingester is now deployed and running off the live queue. Barring any problems with the merge or from C6 / datahub, this is pretty much done.

mattdebont commented 5 years ago

I have a working Java version of the ingestion Lambda, with a working Tika text parser to extract file contents when a file is larger than 10MB.

See Issue #52 for further details