
# Rialto derivatives


This project contains Lambda functions that migrate data from Neptune to Solr and Postgres when an appropriately formatted SNS message is received. In the RIALTO architecture these messages come from https://github.com/sul-dlss/rialto-trigger-rebuild when a full rebuild is needed, or from https://github.com/sul-dlss/sparql-loader when a single entity needs to be updated.
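
The payload these functions receive is the JSON document published to the `data-update` topic (see the "Publish a Message" step below). As a rough sketch only, assuming the handlers are built on `github.com/aws/aws-lambda-go` (the struct, file name, and logging are illustrative, not the repository's actual code):

    // sns_handler_sketch.go: illustrative only, not code from this repository.
    // It shows how a handler wired up with github.com/aws/aws-lambda-go could
    // decode the message published to the data-update topic (see "Publish a
    // Message" below). Struct and field names mirror that example payload.
    package main

    import (
        "context"
        "encoding/json"
        "log"

        "github.com/aws/aws-lambda-go/events"
        "github.com/aws/aws-lambda-go/lambda"
    )

    // Message is the JSON document carried in each SNS record.
    type Message struct {
        Action   string   `json:"Action"`   // e.g. "touch"
        Entities []string `json:"Entities"` // RIALTO entity URIs to re-derive
    }

    func handler(ctx context.Context, event events.SNSEvent) error {
        for _, record := range event.Records {
            var msg Message
            if err := json.Unmarshal([]byte(record.SNS.Message), &msg); err != nil {
                return err
            }
            // A real handler would now query the triplestore for each entity and
            // write the derived documents to Solr or Postgres.
            log.Printf("action=%s entities=%v", msg.Action, msg.Entities)
        }
        return nil
    }

    func main() {
        lambda.Start(handler)
    }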

## Running a lambda on localstack

### Localstack

Start localstack. If you're on a Mac, ensure the Docker daemon is running.

    SERVICES=lambda,sns,sqs LAMBDA_EXECUTOR=docker localstack start

### Blazegraph

Start Blazegraph. On AWS we would use Neptune, but Neptune is not yet a part of localstack.
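
The lambdas read from the SPARQL endpoint configured via `SPARQL_ENDPOINT` in the `create-function` commands below. A minimal sketch, not part of this repository, for checking that Blazegraph is answering on that endpoint before wiring anything up:

    // sparql_check.go: hypothetical helper, not part of this repository.
    // It sends a trivial SPARQL query to the same endpoint the lambdas are
    // configured with and prints the response.
    package main

    import (
        "fmt"
        "io"
        "log"
        "net/http"
        "net/url"
        "strings"
    )

    func main() {
        endpoint := "http://127.0.0.1:9999/blazegraph/namespace/kb/sparql"
        query := "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"

        // SPARQL 1.1 protocol: POST the query as a form-encoded parameter.
        resp, err := http.Post(
            endpoint,
            "application/x-www-form-urlencoded",
            strings.NewReader(url.Values{"query": {query}}.Encode()),
        )
        if err != nil {
            log.Fatalf("Blazegraph is not reachable: %v", err)
        }
        defer resp.Body.Close()

        body, _ := io.ReadAll(resp.Body)
        fmt.Printf("HTTP %d\n%s\n", resp.StatusCode, body)
    }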

### Create the lambda zip files, upload and subscribe them to SNS topics

    make
The full sequence can also be run by hand:

1. Start localstack. If you're on a Mac, ensure the Docker daemon is running.

    SERVICES=lambda,sns LAMBDA_EXECUTOR=docker localstack start
2. Set up the environment for localstack

    export AWS_DEFAULT_REGION=us-east-1
    export AWS_ACCESS_KEY_ID=_not_needed_locally_
    export AWS_SECRET_ACCESS_KEY=_not_needed_locally_
3. Upload the zips and create the function definitions

    
    aws lambda \
    --endpoint-url http://localhost:4574 create-function \
    --function-name f1 \
    --runtime go1.x \
    --role r1 \
    --handler postgres_derivative \
    --environment "Variables={\
    SPARQL_ENDPOINT=http://127.0.0.1:9999/blazegraph/namespace/kb/sparql, \
    RDS_DB_NAME=rialto_development, \
    RDS_USERNAME=postgres, \
    RDS_HOSTNAME=127.0.0.1, \
    RDS_PORT=5432, \
    RDS_PASSWORD=sekret}" \
    --zip-file fileb://postgres_derivative.zip

4. Do the same for the Solr function

    aws lambda \
    --endpoint-url http://localhost:4574 create-function \
    --function-name f2 \
    --runtime go1.x \
    --role r1 \
    --handler solr_derivative \
    --environment "Variables={SOLR_HOST=http://127.0.0.1:8983/solr,SOLR_COLLECTION=collection1,\
    SPARQL_ENDPOINT=http://127.0.0.1:9999/blazegraph/namespace/kb/sparql}" \
    --zip-file fileb://solr_derivative.zip


5. Create SNS topic

    aws sns \
    --endpoint-url=http://localhost:4575 create-topic \
    --name data-update


6. Subscribe to SNS events

    aws sns \
    --endpoint-url=http://localhost:4575 subscribe \
    --topic-arn arn:aws:sns:us-east-1:123456789012:data-update \
    --protocol lambda \
    --notification-endpoint arn:aws:lambda:us-east-1:000000000000:function:f1

    aws sns \
    --endpoint-url=http://localhost:4575 subscribe \
    --topic-arn arn:aws:sns:us-east-1:123456789012:data-update \
    --protocol lambda \
    --notification-endpoint arn:aws:lambda:us-east-1:000000000000:function:f2


7. Start Solr and create a collection

    gem install solr_wrapper
    solr_wrapper


8. Publish a Message

    aws sns \
    --endpoint-url=http://localhost:4575 publish \
    --topic-arn arn:aws:sns:us-east-1:123456789012:data-update \
    --message '{"Records": [{"EventSource": "foo", "Sns": { "Timestamp": "2014-05-16T08:28:06.801Z", "Message": "{\"Action\": \"touch\", \"Entities\": [\"http://sul.stanford.edu/rialto/agents/orgs/school-of-engineering\"]}" }}]}'


9. View output

    When you go to http://127.0.0.1:8983/solr/collection1/select?q=*:* you should see an item record with:

        "_source":{"foo": "barfoo"}


10. Cleanup (necessary before you upload a newer version of a function)

    aws lambda \
    --endpoint-url=http://localhost:4574 delete-function \
    --function-name f1

    Repeat with `--function-name f2` to remove the Solr function.


## Testing

    go test ./...


### Test database
The `database.dump` file was generated by checking out rialto-webapp and doing:
`pg_dump rialto_test > database.dump`

To restore it:

    psql circle_test < database.dump


Alternatively, the test database can be run in a docker container.

To start the db:

    docker run --rm --name rialto_test_db -e POSTGRES_DB=rialto_test -p "5432:5432" -e POSTGRES_USER=$USER -d postgres:9.6.2-alpine

To load the test data:

    cat database.dump | docker exec -i rialto_test_db psql -U $USER rialto_test
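
Optionally, before running the tests, you can check that the container accepts connections and that the dump loaded. This is a small sketch using `database/sql` and `github.com/lib/pq`; the file name and approach are illustrative, not part of the repository, and the connection settings mirror the `docker run` command above:

    // verify_testdb.go: hypothetical helper, not part of this repository.
    // Connects to the rialto_test container started above (trust auth, no
    // password) and counts the tables restored from database.dump.
    package main

    import (
        "database/sql"
        "fmt"
        "log"
        "os"

        _ "github.com/lib/pq"
    )

    func main() {
        dsn := fmt.Sprintf(
            "host=127.0.0.1 port=5432 dbname=rialto_test user=%s sslmode=disable",
            os.Getenv("USER"))

        db, err := sql.Open("postgres", dsn)
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        var tables int
        err = db.QueryRow(
            "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'",
        ).Scan(&tables)
        if err != nil {
            log.Fatalf("cannot query the test database: %v", err)
        }
        fmt.Printf("rialto_test has %d tables\n", tables)
    }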

Run the tests:

    go test ./...

Stop the container:

    docker stop rialto_test_db