Open trappar opened 8 months ago
we've been having the same issue using Azure. Locked in at turbo v1.10 for now until we have time to fully investigate
Super weird. We're using a remote-cache server and Turbo 1.12.5 in other projects, and so far, we haven't had any problems at all
I've found what is causing this in my particular case.
I have two different workflows. I was only seeing this issue appear in one of them.
Both of them utilize my GH Action to start a cache server like this:
- uses: trappar/turborepo-remote-cache-gh-action@v2
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  with:
    storage-provider: s3
    storage-path: turborepo-cache
However, the one that was failing had the following preceding it:
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::${{ secrets.ACCOUNT_ID }}:role/[REDACTED]
    aws-region: us-east-1
If I simply switch the order of these so that the remote cache server starts before configuring AWS credentials, then the error disappears.
So this may or may not be a bug depending on which credentials should take precedence. I assumed that the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env variables would take precedence over anything else, but that is clearly incorrect. I'm not actually 100% sure what the output of that aws action is (does it create an ~/.aws/credentials file or something?), but it looks like it's taking over, and those credentials don't have permission to view the S3 bucket I'm telling the cache server to use.
Regardless of whether this app is doing something wrong or not, it does seem like there's room to improve the handling of this authentication failure case. These 412 Precondition Failed errors are super opaque for an end user.
Maybe someone who knows more about AWS authentication could help here?
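One possible explanation, offered as an assumption rather than a confirmed diagnosis: as far as I know, aws-actions/configure-aws-credentials exports AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN as environment variables for later steps. A step-level env block that overrides only the first two would then leave the assumed role's session token in place next to unrelated static keys. The sketch below is a hypothetical diagnostic helper (checkAwsCredentialEnv is not part of turborepo-remote-cache or any AWS SDK) that flags such a mix. It relies on the AWS convention that long-term access key IDs start with AKIA and temporary ones with ASIA.

```typescript
// Hypothetical diagnostic helper; not part of turborepo-remote-cache or the AWS SDK.
type EnvVars = Record<string, string | undefined>;

function checkAwsCredentialEnv(env: EnvVars): string[] {
  const warnings: string[] = [];
  const keyId = env.AWS_ACCESS_KEY_ID;
  const secret = env.AWS_SECRET_ACCESS_KEY;
  const sessionToken = env.AWS_SESSION_TOKEN;

  if (!keyId || !secret) {
    warnings.push(
      'AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY is not set; the SDK will fall back to other credential sources.',
    );
    return warnings;
  }

  // Long-term IAM user keys start with "AKIA"; temporary STS keys start with "ASIA".
  if (keyId.startsWith('AKIA') && sessionToken) {
    warnings.push(
      'A long-term access key is combined with AWS_SESSION_TOKEN; the token may be left over from an earlier role assumption.',
    );
  }
  if (keyId.startsWith('ASIA') && !sessionToken) {
    warnings.push('A temporary access key is set without AWS_SESSION_TOKEN.');
  }
  return warnings;
}
```

Calling checkAwsCredentialEnv(process.env) just before the cache server starts and printing any warnings would show whether this mismatch is present in the failing workflow.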
just tried with turbo v1.13... 412 still persists :(
Does anybody know of alternatives to ducktors / turborepo-remote-cache that don't have this issue?
I'm seeing this as well, 1.10.16 works but newer versions do not.
last compatible turbo version is 1.12.0, for me/us
@NullVoxPopuli, can you please provide a repro repo? I have enough time to investigate this properly, but I can't reproduce it myself.
Is there a Discord or something? I don't want to spam everyone while I debug. I saw a "request body too large" error at one point while I was looking at the server logs, but I don't know if that's the problem -- trying to figure out some filter syntax.
Yes, we have a nonpublic Discord that we set up a while ago, but I would love to have you join! https://discord.gg/PCnY8BEg
I don't think wontfix is necessarily appropriate here for two reasons:
- This only appears upon switching to Turbo versions above 1.10.16. Why is this same setup valid on 1.10.16? Seems like there's more to the story that's worth investigating.
- The error handling needs to be improved.
You are correct in saying this. As I said, we need a repro to investigate further. I also agree with the "better error handling" part you said.
@fox1t Ok, I can't guarantee that the cause for me is the same as the cause for everyone else, but here's a minimal reproduction:
https://github.com/exogee-technology/turborepo-remote-cache-323-reproduction
Let me know if you need anything else!
Just wanted to confirm that as a workaround I've done the following in my deployment:
// Create an access key and secret key for the service to access the bucket as a workaround for
// https://github.com/ducktors/turborepo-remote-cache/issues/323
const user = new User(this, 'Issue323WorkaroundUser');
const accessKey = new CfnAccessKey(this, 'Issue323WorkaroundUserAccessKey', {
  userName: user.userName,
});
bucket.grantReadWrite(user);

const service = new ApplicationLoadBalancedEc2Service(this, 'TurborepoCacheService', {
  taskImageOptions: {
    env: {
      S3_ACCESS_KEY: accessKey.ref,
      S3_SECRET_KEY: accessKey.attrSecretAccessKey,
      // etc
    },
    // etc
  },
  // etc
});
And this works fine. So it does really seem to be "When you don't pass secret key and access key, the server is unable to assume the execution role of the task as it should by default."
Awesome! How can we fix this directly in the app?
Personally I'd start by upgrading from aws-sdk v2 to v3. I'm not sure that'd fix it, but it might, and if it didn't then we'd have a better chance of working with AWS to figure out the root cause.
It'd also be good to catch this error more specifically and log out what's happening in a less cryptic way in this scenario.
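A minimal sketch of what "less cryptic" logging could look like, assuming the server has access to the underlying storage error before it responds with 412. The function name describeStorageError and the hint texts are hypothetical and not taken from the project. The codes listed are only the ones that come up in this thread or that aws-sdk v2 and S3 are known to report, so the table is illustrative rather than complete.

```typescript
// Hypothetical helper; the name and hint texts are illustrative, not the project's code.
interface StorageErrorLike {
  name?: string;
  code?: string;
  message?: string;
}

// Hints keyed by error code or name, as seen in this thread or reported by aws-sdk v2 / S3.
const STORAGE_ERROR_HINTS: Record<string, string> = {
  AccessDenied: 'The credentials in use may lack permission on the configured bucket.',
  CredentialsError: 'No usable AWS credentials were found; check the access keys or the task role.',
  TimeoutError: 'The storage endpoint could not be reached in time; check network and proxy settings.',
  EPROTO: 'The TLS handshake with the storage endpoint failed; check the endpoint URL and TLS settings.',
};

function describeStorageError(err: StorageErrorLike): string {
  // Prefer the SDK-style `code`, then fall back to the error name.
  const id = err.code ?? err.name ?? 'UnknownError';
  const hint = STORAGE_ERROR_HINTS[id] ?? 'No specific hint is available for this error.';
  return `Storage provider error (${id}): ${err.message ?? 'no message'}. ${hint}`;
}
```

Logging the returned string next to the 412 response would at least tell the operator whether they are looking at a permissions, credentials, network, or TLS problem.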
I was able to replicate it using a homelab server, k3s and minio. I assume it is something related to the s3 bucket connection
"originalError":{"message":"write EPROTO 08C2C90647750000:error:0A000410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure:../deps/openssl/openssl/ssl/record/rec_layer_s3.c:1590:SSL
I faced the 412 Precondition Failed issue with S3 as my Storage Provider recently while running the server (version 2.2.1) locally behind a proxy (http_proxy, https_proxy). In my case, the underlying error was denoted as a TimeoutError. For transparency, my monorepo project is using turbo@2.0.7.
After using the AWS CLI directly (aws s3 ls s3://bucket-name) to confirm that the issue was not with my assumed temporary credentials, I started trying to isolate the issue between Turborepo Remote Cache directly and aws-sdk.
I'll spare the extra details, but essentially I set up a separate bare-bones project and only installed the exact version of aws-sdk that is used here (i.e., npm i -E aws-sdk@2.1310.0). I then created a single script where the only function was trying to "get object" from the S3 bucket, and upon running I again was greeted with a TimeoutError.
Some quick research brought me to the AWS "Configuring Proxies for Node.js" docs page, which pointed out that a third-party HTTP agent package, such as proxy-agent, would be required.
I npm installed Turborepo Remote Cache into a separate project, then proxy-agent, and finally modified remote-cache/storage/s3.ts (so s3.js in the project's node_modules) by adding the proxy configuration before the creation of the S3 client. This resolved my issue.
import { ProxyAgent } from 'proxy-agent';
// ...
export function createS3(...) {
  // Added this before creating client
  aws.config.update({
    httpOptions: {
      agent: new ProxyAgent(process.env.https_proxy),
    },
  });
  const client = new aws.S3(...);
  // ...
}
Again, this was only an issue locally where a proxy was set. Unfortunately, something simple like NO_PROXY="amazonaws.com" would not suffice in my case as the proxy is actually required for me to communicate with AWS entirely. A solution like this would only really make sense for local development and local testing, and not for an actual app being deployed to AWS.
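Since the proxy agent only makes sense where a proxy is actually set, one way to keep deployed environments unaffected is to apply it conditionally. The sketch below is a hypothetical helper (resolveProxyUrl is not part of turborepo-remote-cache) that picks the first non-empty proxy variable. The lookup order is an assumption, and the commented usage simply mirrors the snippet from this thread without having been verified against a particular proxy-agent version.

```typescript
// Hypothetical helper; the variable lookup order is an assumption.
type ProxyEnv = Record<string, string | undefined>;

function resolveProxyUrl(env: ProxyEnv): string | undefined {
  // HTTPS variants come first because S3 traffic normally goes over TLS.
  const candidates = [env.https_proxy, env.HTTPS_PROXY, env.http_proxy, env.HTTP_PROXY];
  for (const value of candidates) {
    if (value && value.trim() !== '') {
      return value.trim();
    }
  }
  // No proxy configured: the caller should leave the SDK's default agent alone.
  return undefined;
}

// Usage, mirroring the workaround in this thread (unverified):
//   const proxyUrl = resolveProxyUrl(process.env);
//   if (proxyUrl) {
//     aws.config.update({ httpOptions: { agent: new ProxyAgent(proxyUrl) } });
//   }
```

With this guard, environments without any proxy variables keep the default behavior, and the agent is only swapped in for local setups like the one described here.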
I'm not sure if this will help in the case of the original post, but maybe this will help others searching for 412 Precondition Failed. Also, I agree that it would be nice for aws-sdk to be upgraded to v3 at some point.
Bug Report
I updated Turbo from 1.10.16 to 1.12.5 today and started seeing this in CI:
There is some discussion around this in this turbo issue, where people mention that this is likely due to using this remote cache server along with S3 specifically.
To Reproduce
I doubt that it will be possible for me to create reproduction instructions / repo for this issue considering that others have failed to reliably reproduce this in the thread above.
Expected behavior
To not get the http status errors.
Your Environment
- trappar/turborepo-remote-cache-gh-action@v2, which is a new version I've been working on in order to support the up-to-date version of this package.
- 1.12.5
- ubuntu-latest