HeardLibrary / vandycite


Configure Neptune instance #57

Closed baskaufs closed 2 years ago

baskaufs commented 2 years ago
  1. In Neptune, created the database. Chose a version that was not the latest release.
  2. Set the instance size to large.
  3. Kept the defaults on almost everything; left the port at 8182.
  4. Changed the notebook instance type from the default to medium.
  5. For the role, appended triplestore1 to the name.
  6. Tagged the owner as DISC.
  7. Unchecked "Enable deletion protection".
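
For reference, the console choices above could be approximated in a script. This is only a rough sketch using boto3's Neptune client, not what we actually ran: the helper names, the `db.r5.large` instance class (the notes only say "large"), and the lowercase `owner` tag key are assumptions, and the engine version and region are left as required arguments because the notes don't record them. The notebook and role steps are not covered.

```python
def neptune_cluster_params(cluster_id, engine_version, owner="DISC"):
    """Keyword arguments for create_db_cluster, mirroring the console choices."""
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "neptune",
        "EngineVersion": engine_version,   # caller picks a non-latest release
        "Port": 8182,                      # left at the default
        "DeletionProtection": False,       # "enable deletion protection" unchecked
        "Tags": [{"Key": "owner", "Value": owner}],  # tag key name is an assumption
    }


def neptune_instance_params(cluster_id, instance_id, instance_class="db.r5.large"):
    """Keyword arguments for create_db_instance; the instance class is an assumption."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": instance_class,
        "Engine": "neptune",
        "DBClusterIdentifier": cluster_id,
    }


def create_neptune(cluster_id, instance_id, engine_version, region):
    """Create the cluster and one instance. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3 installed

    client = boto3.client("neptune", region_name=region)
    client.create_db_cluster(**neptune_cluster_params(cluster_id, engine_version))
    client.create_db_instance(**neptune_instance_params(cluster_id, instance_id))
```

Keeping the parameters in plain dictionaries makes it easy to review the settings before anything is created.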

Next, went into VPC to create an endpoint for S3 bucket access.

  1. Service name is com.amazonaws.us-east-1.s3
  2. Chose "gateway" type.
  3. Used default VPC, which is for our account and region (?).
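
The same endpoint could probably be created from a script. A minimal sketch with boto3's EC2 client follows; the function names are made up for illustration, and the route table IDs are an assumption (a gateway endpoint is attached to route tables, which the steps above don't mention). The `is-default` filter is one way to check which VPC is the account's default in a region.

```python
def s3_gateway_endpoint_params(vpc_id, route_table_ids, region="us-east-1"):
    """Keyword arguments for create_vpc_endpoint for an S3 gateway endpoint."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def find_default_vpc_id(ec2):
    """Return the ID of the default VPC for the client's region, or None."""
    resp = ec2.describe_vpcs(Filters=[{"Name": "is-default", "Values": ["true"]}])
    vpcs = resp.get("Vpcs", [])
    return vpcs[0]["VpcId"] if vpcs else None


def create_s3_gateway_endpoint(route_table_ids, region="us-east-1"):
    """Create the endpoint in the default VPC. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    vpc_id = find_default_vpc_id(ec2)
    if vpc_id is None:
        raise RuntimeError("no default VPC found in " + region)
    return ec2.create_vpc_endpoint(
        **s3_gateway_endpoint_params(vpc_id, route_table_ids, region)
    )
```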

~The notebook we created didn't work. The error message was "Failure reason The Notebook Instance type 'ml.t3.xlarge' is not available in the availability zone 'us-east-1e'. We apologize for the inconvenience. Please try again using subnet in a different availability zone, or try a different instance type." and we got it with every size type up to xlarge (medium, large, xlarge).~

Problem fixed by putting the notebook in the correct availability zone.

baskaufs commented 2 years ago

@CliffordAnderson @awesolek2 It appears that it's not possible for clients outside the VPC to connect to Neptune. See this page about connecting via a Load Balancer and this page about accessing via a Lambda function. Of the two options, the load balancer seems simpler, since I think we'd have to write our own Lambda function.

CliffordAnderson commented 2 years ago

Yes, I agree that using a load balancer seems like the right approach. Thanks for researching these alternatives.

baskaufs commented 2 years ago

Test query for loading data into the triplestore using a SageMaker notebook:

%%sparql

LOAD <https://iiif-library-manifests.s3.amazonaws.com/format.nq> INTO GRAPH <http://format>
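
Outside the notebook magic, the same update could in principle be sent straight to the cluster's SPARQL endpoint over HTTP. The sketch below only builds the request with the standard library. The host name is a placeholder, `build_load_request` is a made-up helper, and it assumes IAM authentication is not enabled on the cluster. As noted earlier in this thread, the request would have to be sent from inside the VPC.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_load_request(neptune_host, source_url, graph_iri, port=8182):
    """Build a POST request carrying a SPARQL UPDATE LOAD statement."""
    update = f"LOAD <{source_url}> INTO GRAPH <{graph_iri}>"
    # SPARQL updates are sent as a form-encoded "update" parameter.
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        f"https://{neptune_host}:{port}/sparql",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder host; substitute the real cluster endpoint.
req = build_load_request(
    "your-neptune-endpoint.example",
    "https://iiif-library-manifests.s3.amazonaws.com/format.nq",
    "http://format",
)
# To send it from a machine inside the VPC:
#     from urllib.request import urlopen
#     print(urlopen(req).read().decode())
```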

baskaufs commented 2 years ago

When we moved the Neptune instance to us-east-1, the loading test in the Jupyter notebook didn't work. However, there were several actions we took:

I'm not sure which of these actions were necessary, but after we did them, I could successfully issue the load command for S3 from the Jupyter notebook, as we had with the us-east-2 instance of Neptune.