ducktors / turborepo-remote-cache

Open source implementation of the Turborepo custom remote cache server.
https://ducktors.github.io/turborepo-remote-cache/
MIT License

HTTP Error 412 while using recent Turbo versions #323

trappar opened this issue 8 months ago

trappar commented 8 months ago

πŸ› Bug Report

I updated Turbo from 1.10.16 to 1.12.5 today and started seeing this in CI:

 WARNING  failed to contact remote cache: Error making HTTP request: HTTP status client error (412 Precondition Failed) for url (http://0.0.0.0:45045/v8/artifacts/e946449d9e73b6d1?slug=ci)

There is some discussion around this in a related turbo issue, where people mention that it is likely due to using this remote cache server along with S3 specifically.

To Reproduce

I doubt that it will be possible for me to create reproduction instructions or a repro repo for this issue, considering that others in the thread above have failed to reproduce it reliably.

Expected behavior

Not to get these HTTP status errors when contacting the remote cache.

Your Environment

spacedawwwg commented 8 months ago

We've been having the same issue using Azure. We're locked at turbo v1.10 for now until we have time to investigate fully.

matteovivona commented 8 months ago

Super weird. We're using a remote-cache server and Turbo 1.12.5 in other projects, and so far we haven't had any problems at all.

trappar commented 8 months ago

I've found what is causing this in my particular case.

I have two different workflows, and I was only seeing this issue in one of them.

Both of them use my GitHub Action to start a cache server like this:

- uses: trappar/turborepo-remote-cache-gh-action@v2
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  with:
    storage-provider: s3
    storage-path: turborepo-cache

However, the one that was failing had the following step preceding it:

- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::${{ secrets.ACCOUNT_ID }}:role/[REDACTED]
    aws-region: us-east-1

If I simply switch the order of these so that the remote cache server starts before configuring AWS credentials, then the error disappears.

So this may or may not be a bug, depending on which credentials should take precedence. I assumed that the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env variables would take precedence over anything else, but that is clearly incorrect. I'm not actually 100% sure what the output of that AWS action is (does it create an ~/.aws/credentials file or something?), but it looks like it takes over, and those credentials don't have permission to access the S3 bucket I'm telling the cache server to use.
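
For anyone debugging the same thing, a quick way to check which credentials the SDK actually ends up with is to run something like this (rough aws-sdk v2 sketch) in the same job, right where the cache server starts:

import AWS from 'aws-sdk';

// Resolve credentials the way the SDK's default provider chain would, then
// print the access key id so you can tell which source "won" (the step-level
// env vars vs. whatever configure-aws-credentials exported earlier).
AWS.config.getCredentials((err) => {
  if (err) {
    console.error('No credentials resolved:', err.message);
    return;
  }
  console.log('Resolved access key id:', AWS.config.credentials?.accessKeyId);
});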

Regardless of whether this app is doing something wrong or not, there does seem to be room to improve the handling of this authentication failure case. These 412 Precondition Failed errors are super opaque for an end user.

Maybe someone who knows more about AWS authentication could help here?

trappar commented 8 months ago

I don't think wontfix is necessarily appropriate here for two reasons:

  1. This only appears upon switching to Turbo versions above 1.10.16. Why is this same setup valid on 1.10.16? Seems like there's more to the story that's worth investigating.
  2. The error handling needs to be improved.
spacedawwwg commented 8 months ago

Just tried with turbo v1.13... the 412 still persists :(


spacedawwwg commented 7 months ago

Does anybody know of alternatives to ducktors / turborepo-remote-cache that don't have this issue?

MisterJimson commented 7 months ago

I'm seeing this as well, 1.10.16 works but newer versions do not.

NullVoxPopuli commented 6 months ago

last compatible turbo version is 1.12.0, for me/us

fox1t commented 5 months ago

@NullVoxPopuli, can you please provide a repro repo? I have enough time to investigate this properly, but I can't reproduce it myself.

NullVoxPopuli commented 5 months ago

Is there a Discord or something? I don't want to spam everyone while I debug. I saw a "request body too large" error at one point while I was looking at the server logs, but I don't know if that's the problem. I'm trying to figure out some filter syntax πŸ˜…

fox1t commented 5 months ago

Yes, we have a nonpublic Discord that we set up a while ago, but I would love to have you join! https://discord.gg/PCnY8BEg

fox1t commented 5 months ago

I don't think wontfix is necessarily appropriate here for two reasons:

  1. This only appears upon switching to Turbo versions above 1.10.16. Why is this same setup valid on 1.10.16? Seems like there's more to the story that's worth investigating.
  2. The error handling needs to be improved.

You are correct in saying this. As I said, we need a repro to investigate further. I also agree with the point about better error handling.

thekevinbrown commented 4 months ago

@fox1t Ok, I can't guarantee that the cause for me is the same as the cause for everyone else, but here's a minimal reproduction:

https://github.com/exogee-technology/turborepo-remote-cache-323-reproduction

Let me know if you need anything else!

thekevinbrown commented 4 months ago

Just wanted to confirm that as a workaround I've done the following in my deployment:

    // Excerpt from a CDK stack (User and CfnAccessKey come from aws-cdk-lib/aws-iam,
    // ApplicationLoadBalancedEc2Service from aws-cdk-lib/aws-ecs-patterns).
    // Create an access key and secret key for the service to access the bucket as a workaround for
    // https://github.com/ducktors/turborepo-remote-cache/issues/323
    const user = new User(this, 'Issue323WorkaroundUser');
    const accessKey = new CfnAccessKey(this, 'Issue323WorkaroundUserAccessKey', {
        userName: user.userName,
    });
    bucket.grantReadWrite(user);

    const service = new ApplicationLoadBalancedEc2Service(this, 'TurborepoCacheService', {
        taskImageOptions: {
            // ecs-patterns expects `environment` here (there is no `env` prop)
            environment: {
                S3_ACCESS_KEY: accessKey.ref,
                S3_SECRET_KEY: accessKey.attrSecretAccessKey,
                // etc
            },
            // etc
        },
        // etc
    });

And this works fine. So it does really seem to be "When you don't pass secret key and access key, the server is unable to assume the execution role of the task as it should by default."
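
If you'd rather avoid minting static keys, another option I haven't tried here would be to point the v2 SDK at the task-role credentials explicitly before the S3 client is created. The timeout/retry values below are just the ones from the AWS docs example:

import AWS from 'aws-sdk';

// Untested sketch: make aws-sdk v2 use the ECS task-role credentials
// explicitly instead of relying on static keys.
AWS.config.credentials = new AWS.ECSCredentials({
  httpOptions: { timeout: 5000 }, // 5-second timeout on the credentials endpoint
  maxRetries: 10,
});

const s3 = new AWS.S3();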

fox1t commented 4 months ago

Awesome! How can we fix this directly in the app?

thekevinbrown commented 4 months ago

Personally I'd start by upgrading from aws-sdk v2 to v3. I'm not sure that'd fix it, but it might, and if it didn't then we'd have a better chance of working with AWS to figure out the root cause.

It'd also be good to catch this error more specifically and log out what's happening in a less cryptic way in this scenario.
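
Just to make that concrete, here's a very rough sketch of what a v3-based storage/s3.ts could look like (not the project's actual code; getArtifact and the env-var handling are placeholders). With no explicit credentials, v3 falls back to its default provider chain, which includes the ECS container credentials, so the task role should get picked up automatically:

import { GetObjectCommand, S3Client } from '@aws-sdk/client-s3';

// Only pass static keys when they are explicitly configured; otherwise let
// the default provider chain resolve credentials (env vars, shared config,
// ECS task role, EC2 instance profile, ...).
const client = new S3Client({
  region: process.env.AWS_REGION,
  ...(process.env.S3_ACCESS_KEY && process.env.S3_SECRET_KEY
    ? {
        credentials: {
          accessKeyId: process.env.S3_ACCESS_KEY,
          secretAccessKey: process.env.S3_SECRET_KEY,
        },
      }
    : {}),
});

// Placeholder helper, not the project's real API
export async function getArtifact(bucket: string, key: string) {
  const { Body } = await client.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  return Body;
}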

matteovivona commented 4 months ago


I was able to replicate it using a homelab server, k3s, and MinIO. I assume it is something related to the S3 bucket connection.

"originalError":{"message":"write EPROTO 08C2C90647750000:error:0A000410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure:../deps/openssl/openssl/ssl/record/rec_layer_s3.c:1590:SSL

mattref commented 1 month ago

I faced the 412 Precondition Failed issue with S3 as my Storage Provider recently while running the server (version 2.2.1) locally behind a proxy (http_proxy, https_proxy). In my case, the underlying error was denoted as a TimeoutError. For transparency, my monorepo project is using turbo@2.0.7.

After using the AWS CLI directly (aws s3 ls s3://bucket-name) to confirm that the issue was not with my assumed temporary credentials, I started trying to isolate the issue between Turborepo Remote Cache directly and aws-sdk.

I'll spare the extra details, but essentially I set up a separate bare-bones project and installed only the exact version of aws-sdk that is used here (i.e., npm i -E aws-sdk@2.1310.0). I then created a single script whose only function was trying to get an object from the S3 bucket, and upon running it I was again greeted with a TimeoutError.
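
For reference, the script was essentially just this (reconstructed sketch; the bucket and key names are placeholders):

import AWS from 'aws-sdk'; // pinned with: npm i -E aws-sdk@2.1310.0

const s3 = new AWS.S3();

// The only thing the script does: fetch one object from the bucket.
// Behind the proxy this fails with a TimeoutError, same as the cache server.
s3.getObject({ Bucket: 'my-test-bucket', Key: 'some-key' })
  .promise()
  .then((res) => console.log('OK, got', res.ContentLength, 'bytes'))
  .catch((err) => console.error(err.name, err.message));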

Some quick research brought me to the AWS "Configuring Proxies for Node.js" docs page, which pointed out that a third-party HTTP agent package, such as proxy-agent, would be required.

I npm installed Turborepo Remote Cache into a separate project, then proxy-agent, and finally modified remote-cache/storage/s3.ts (so s3.js in the project's node_modules) by adding the proxy configuration before the creation of the S3 client. This resolved my issue.

import { ProxyAgent } from 'proxy-agent';
// ... ('aws' is the aws-sdk v2 module already imported in storage/s3.ts)
export function createS3(...) {
  // Added this before creating the client so SDK requests go through the proxy
  aws.config.update({
    httpOptions: {
      agent: new ProxyAgent(process.env.https_proxy),
    },
  });
  const client = new aws.S3(...);
  // ...
}

Again, this was only an issue locally where a proxy was set. Unfortunately, something simple like NO_PROXY="amazonaws.com" would not suffice in my case, as the proxy is actually required for me to communicate with AWS at all. A solution like this would only really make sense for local development and local testing, and not for an actual app being deployed to AWS.
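
If something like this were ever added to the server itself, it would presumably need to be opt-in. A hypothetical variant (assuming proxy-agent v6, which resolves the proxy per request from http_proxy/https_proxy/no_proxy on its own, and with aws again being the existing aws-sdk v2 import in s3.ts) could look like:

import { ProxyAgent } from 'proxy-agent';

// Only attach the proxy agent when a proxy is actually configured;
// ProxyAgent itself picks the right proxy (or none) per request from the env.
if (process.env.https_proxy || process.env.HTTPS_PROXY) {
  aws.config.update({
    httpOptions: { agent: new ProxyAgent() },
  });
}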

I'm not sure if this will help in the case of the original post, but maybe this will help others searching for 412 Precondition Failed. Also, I agree that it would be nice for aws-sdk to be upgraded to v3 at some point πŸ˜ƒ .