aws / amazon-neptune-sigv4-signer

A library for Amazon Neptune that enables AWS Signature Version 4 signing for HTTP using Netty.
Apache License 2.0

Facing 403 Forbidden when SigV4 is used #22

Closed: dayanfcosta closed this issue 6 months ago

dayanfcosta commented 6 months ago

Hi all, I recently upgraded my cluster from version 1.1.0.x to 1.3.x.x and am facing issues when bumping my gremlin driver to version 3.6.x.

The main issue: I'm following your documentation, which says to use a handshake interceptor with NeptuneNettyHttpSigV4Signer to sign the request, but it isn't working. The server returns only 403 responses, even though, as far as I can tell, the configuration is correct.

Before upgrading, the project was using the SigV4WebSocketChannelizer which is not working anymore, and this led me to think that this new NeptuneNettyHttpSigV4Signer does not work with WebSocket requests. Is that correct?

triggan commented 6 months ago

The new NeptuneNettyHttpSigV4Signer does work with WebSocket requests. I just attempted the following against a new cluster and this worked. This is a bit of a hybrid of the two Java examples we have in the Neptune docs. I'll work to get that documentation updated. Let me know if you are still seeing issues even after using this approach. Or feel free to provide any other code examples where you may be taking a different approach and expecting it to work the same.

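// Imports used by the snippet below.
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.neptune.auth.NeptuneNettyHttpSigV4Signer;
import com.amazonaws.neptune.auth.NeptuneSigV4SignerException;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;
import org.apache.tinkerpop.gremlin.structure.T;
import static org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource.traversal;

Cluster cluster = Cluster.build()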
                 .addContactPoint("neptune-cluster-endpoint.us-west-2.neptune.amazonaws.com")
                 .enableSsl(true)
                 .port(8182)
                 .handshakeInterceptor( r ->
                  {
                    try {
                      NeptuneNettyHttpSigV4Signer sigV4Signer =
                        new NeptuneNettyHttpSigV4Signer("us-west-2", new DefaultAWSCredentialsProviderChain());
                      sigV4Signer.signRequest(r);
                    } catch (NeptuneSigV4SignerException e) {
                      throw new RuntimeException("Exception occurred while signing the request", e);
                    }
                    return r;
                  }
                 ).create();

    GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster));

    // Add a vertex.
    // Note that a Gremlin terminal step, e.g. iterate(), is required to make a request to the remote server.
    // The full list of Gremlin terminal steps is at https://tinkerpop.apache.org/docs/current/reference/#terminal-steps
    g.addV("Person").property("Name", "Justin").iterate();

    // Add a vertex with a user-supplied ID.
    g.addV("Custom Label").property(T.id, "CustomId1").property("name", "Custom id vertex 1").iterate();
    g.addV("Custom Label").property(T.id, "CustomId2").property("name", "Custom id vertex 2").iterate();

    g.addE("Edge Label").from(__.V("CustomId1")).to(__.V("CustomId2")).iterate();

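    // Read back up to three vertices as element maps (ID, label, and properties).
    GraphTraversal t = g.V().limit(3).elementMap();

    // Print each result as it is consumed. Calling t.toList() inside the loop
    // would drain the traversal and drop the first element.
    t.forEachRemaining(
      e ->  System.out.println(e)
    );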

    cluster.close();

dayanfcosta commented 6 months ago

Hi @triggan, my Cluster configuration is essentially the same as yours, as you can see below. It's Kotlin code, and this function is responsible for creating the DriverRemoteConnection.

fun remoteConnection(): DriverRemoteConnection? {
        logger.info("Creating neptune connection with the following parameters: $neptuneProperties")
        val builder = Cluster.build()
            .addContactPoint(neptuneProperties.endpoint)
            .port(neptuneProperties.port)
            .maxInProcessPerConnection(neptuneProperties.maxInProcessPerConnection)
            .maxSimultaneousUsagePerConnection(neptuneProperties.maxSimultaneousUsagePerConnection)
            .minConnectionPoolSize(neptuneProperties.minPoolSize)
            .maxConnectionPoolSize(neptuneProperties.maxPoolSize)
            .maxContentLength(65536 * 1000) // This is 1000x the default value, it intends to avoid CorruptedFrameException
            .serializer(Serializers.GRAPHBINARY_V1D0)
            .connectionSetupTimeoutMillis(10000)

        if (neptuneProperties.serviceRegion.isNotBlank()) {
            builder.enableSsl(true)
            builder.handshakeInterceptor { request ->
                val signer =
                    NeptuneNettyHttpSigV4Signer(neptuneProperties.serviceRegion, DefaultAWSCredentialsProviderChain())
                signer.signRequest(request)
                request
            }
        }

        return DriverRemoteConnection.using(builder.create())
    }

I added logging in the interceptor before and after signing the request. Here are the logged requests.

before sign

DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
GET /gremlin HTTP/1.1
User-Agent: NotAvailable Gremlin-Java.3.6.2 17.0.10 Linux.5.10.209-198.812.amzn2.x86_64 amd64
host: my-cluster-endpoint.eu-west-1.neptune.amazonaws.com:8182
upgrade: websocket
connection: upgrade
sec-websocket-key: generated key
origin: https://my-cluster-endpoint.eu-west-1.neptune.amazonaws.com:8182
sec-websocket-version: 13

after sign

DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
GET /gremlin HTTP/1.1
User-Agent: NotAvailable Gremlin-Java.3.6.2 17.0.10 Linux.5.10.209-198.812.amzn2.x86_64 amd64
upgrade: websocket
connection: upgrade
sec-websocket-key: generated key
origin: https://my-cluster-endpoint.eu-west-1.neptune.amazonaws.com:8182
sec-websocket-version: 13
Host: my-cluster-endpoint.eu-west-1.neptune.amazonaws.com:8182
X-Amz-Date: 20240307T113705Z
Authorization: AWS4-HMAC-SHA256 Credential=credential/20240307/eu-west-1/neptune-db/aws4_request, SignedHeaders=host;origin;sec-websocket-key;sec-websocket-version;upgrade;user-agent;x-amz-date;x-amz-security-token, Signature=signature
X-Amz-Security-Token: token

PS: some values were omitted for security reasons
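
One way to sanity-check a signed request like the one logged above is to confirm that every header named in SignedHeaders is actually present on the request. The following is a minimal, dependency-free sketch of that check; the class and method names (SigV4HeaderCheck, signedHeaders, missing) are hypothetical and are not part of the signer library. It only inspects header names and does not verify the signature itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class SigV4HeaderCheck {

    /** Extracts the SignedHeaders names from a SigV4 Authorization header value. */
    public static List<String> signedHeaders(String authorization) {
        // The value has the form: "<algorithm> Credential=..., SignedHeaders=a;b;c, Signature=..."
        for (String part : authorization.split(",")) {
            String trimmed = part.trim();
            if (trimmed.startsWith("SignedHeaders=")) {
                return Arrays.asList(trimmed.substring("SignedHeaders=".length()).split(";"));
            }
        }
        return new ArrayList<>();
    }

    /** Returns the signed header names that do not appear among the request's header names. */
    public static List<String> missing(String authorization, Set<String> requestHeaderNames) {
        // SigV4 lists signed header names in lowercase, so compare case-insensitively.
        Set<String> present = new TreeSet<>();
        for (String name : requestHeaderNames) {
            present.add(name.toLowerCase(Locale.ROOT));
        }
        List<String> missing = new ArrayList<>();
        for (String header : signedHeaders(authorization)) {
            if (!present.contains(header)) {
                missing.add(header);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Header names and Authorization value copied from the "after sign" log above.
        String authorization = "AWS4-HMAC-SHA256 Credential=credential/20240307/eu-west-1/neptune-db/aws4_request, "
                + "SignedHeaders=host;origin;sec-websocket-key;sec-websocket-version;upgrade;user-agent;"
                + "x-amz-date;x-amz-security-token, Signature=signature";
        Set<String> requestHeaders = new TreeSet<>(Arrays.asList(
                "User-Agent", "upgrade", "connection", "sec-websocket-key", "origin",
                "sec-websocket-version", "Host", "X-Amz-Date", "Authorization", "X-Amz-Security-Token"));

        System.out.println("missing signed headers: " + missing(authorization, requestHeaders));
    }
}
```

For the logged request, the list of missing headers is empty, which suggests the request is well formed and that the 403 has to be explained by something other than the set of signed headers.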

triggan commented 6 months ago

That appears correct.

Have you validated through any other means that IAM auth is working properly after the upgrade? Perhaps by using awscurl to send an authenticated request to the /status endpoint?

awscurl https://neptune-endpoint:port/status --service neptune-db --region eu-west-1

Version 1.2.0.0 changed how IAM policies work [1]: the connect action was deprecated. Ensure that you're not relying on the connect action and are using the proper data access policies [2]. I've seen this cause issues during an upgrade, as it also returns a 403. There's typically a more verbose error message with the 403 response when this happens. Hence, I would confirm that IAM is working through another means before ruling out the IAM config.

[1] https://docs.aws.amazon.com/neptune/latest/userguide/engine-releases-1.2.0.0.html
[2] https://docs.aws.amazon.com/neptune/latest/userguide/iam-dp-actions.html
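
As an illustration of the data access policies mentioned above, a policy statement using granular actions instead of connect might look roughly like the sketch below. The region, account ID, and cluster resource ID are placeholders, and the set of actions is an assumption about what a read/write Gremlin client needs; check the data access actions page [2] for the full list before adopting it.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "neptune-db:ReadDataViaQuery",
        "neptune-db:WriteDataViaQuery",
        "neptune-db:DeleteDataViaQuery",
        "neptune-db:GetEngineStatus"
      ],
      "Resource": "arn:aws:neptune-db:eu-west-1:123456789012:cluster-ABCDEFGHIJKLMNOP/*"
    }
  ]
}
```

GetEngineStatus is included here only so that the /status check with awscurl can succeed under the same role.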

dayanfcosta commented 6 months ago

Ooh, I may have misread this release, I'll apply the changes to my IAM Role and test again. Thanks

dayanfcosta commented 6 months ago

I updated the IAM role with the granular permissions and it worked. I'm closing this one, thanks for the support.