ducktors / turborepo-remote-cache

Open source implementation of the Turborepo custom remote cache server.
https://ducktors.github.io/turborepo-remote-cache/
MIT License
1.03k stars, 95 forks

Guidance for running as AWS Lambda #28

Closed paambaati closed 2 years ago

paambaati commented 2 years ago

Looking at the implementation, it feels like it can be deployed anywhere. However, the S3 reads/writes might be tricky, especially for large files, as Lambdas are essentially short-lived.

Do you folks have any guidance/tips on how to effectively run this on AWS Lambda?
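To make the large-file concern concrete, here is a rough sketch of the arithmetic, assuming the artifact is returned in a single synchronous Lambda response and has to be base64-encoded on the way out. The 6 MB payload figure is the commonly documented limit for synchronous invocations and is an assumption here, as is the `overheadBytes` allowance; both function names are hypothetical. Check the current AWS quotas before relying on any of this.

```javascript
// Assumed limit for a synchronous Lambda request or response, in bytes.
// Verify against the current AWS documentation.
const LAMBDA_SYNC_PAYLOAD_LIMIT = 6 * 1024 * 1024;

// Size of a body after base64 encoding: 4 output bytes per 3 input bytes, padded.
function base64Size(rawBytes) {
  return Math.ceil(rawBytes / 3) * 4;
}

// True when an artifact of `rawBytes` would fit in one base64-encoded response.
// `overheadBytes` is a hypothetical allowance for headers and the JSON envelope.
function fitsInLambdaResponse(rawBytes, overheadBytes = 1024) {
  return base64Size(rawBytes) + overheadBytes <= LAMBDA_SYNC_PAYLOAD_LIMIT;
}

console.log(fitsInLambdaResponse(4 * 1024 * 1024)); // true
console.log(fitsInLambdaResponse(5 * 1024 * 1024)); // false
```

Under these assumptions, cache artifacts above roughly 4.5 MB would not fit in one response, so it may be worth checking how large your build outputs actually are.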

fox1t commented 2 years ago

Hi. Lambda is something that we haven't explored yet. It would be nice to provide a guide, though.

dobesv commented 2 years ago

It's probably nicer to just run this locally on each developer's machine and proxy to S3. That way you don't have to worry as much about security vulnerabilities in the server code and compute costs.

paambaati commented 2 years ago

I've managed to run this on Lambda with a minimal set of changes without issues, so closing this.

paambaati commented 2 years ago

> It's probably nicer to just run this locally on each developer's machine and proxy to S3. That way you don't have to worry as much about security vulnerabilities in the server code and compute costs.

Yes, but it's hard to do this across teams – it's difficult to enforce a long-running process as a prerequisite for most dev environments, especially when most developers prefer their own systems/OSes/platforms/workflows.

dobesv commented 2 years ago

Ah, we have a docker-compose that runs services that people need for dev (mongodb, redis, and this), so it works for us anyway.

fox1t commented 2 years ago

> I've managed to run this on Lambda with a minimal set of changes without issues, so closing this.

Would you mind sharing the steps with us so we can add them to the README?

cpitt commented 2 years ago

I got it to run on Lambda/API Gateway and I can hit the API just fine, but for some reason the turbo client doesn't love it. I don't get any errors, but the turbo client reports cache misses even though I can curl the endpoint and get the results. It does, however, push the artifacts to S3 🤷‍♂️

More investigation is needed

Here's the handler code:

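import awsLambdaFastify from '@fastify/aws-lambda'
import { createApp } from 'turborepo-remote-cache/build/app'

// Build the Fastify app. trustProxy makes Fastify honor the X-Forwarded-*
// headers set by the gateway in front of the Lambda.
const app = createApp({
  trustProxy: true,
})

// Wrap the app so Lambda can invoke it as a handler.
export const handler = awsLambdaFastify(app)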

For now I just wrote a script that starts the server and executes turbo locally:

#!/usr/bin/env node

const { spawn } = require('child_process');
const { generateBinPath } = require('turbo/node-platform');
const AWS = require('aws-sdk');

// Point turbo at the local cache server and select S3 as the storage backend.
async function setConfig() {
  process.env.TURBO_TOKEN = 'SUPER_SECRET_VARIABLE';
  process.env.TURBO_API = 'http://localhost:3000';
  process.env.TURBO_TEAMID = 'team_ramrod';
  process.env.STORAGE_PROVIDER = 's3';
  process.env.STORAGE_PATH = 'fluent-turborepo-cache';
}

// Rejects if no valid AWS credentials are available.
async function checkAwsConnection() {
  const sts = new AWS.STS();
  await sts.getCallerIdentity({}).promise();
}

async function exec() {
  let useRemoteCache = false;
  try {
    await checkAwsConnection();
    console.log('AWS connection successful; using remote cache');
    useRemoteCache = true;
    await setConfig();
  } catch (e) {
    console.log('AWS connection could not be established; using local cache');
  }

  if (useRemoteCache) {
    const { createApp } = require('turborepo-remote-cache/build/app');
    const turboCacheServer = createApp({ trustProxy: true, logger: undefined });
    try {
      await turboCacheServer.listen(3000, '127.0.0.1');
    } catch (err) {
      turboCacheServer.log.error(err);
      process.exit(1);
    }
  }

  const turbo = spawn(generateBinPath(), process.argv.slice(2), { stdio: 'inherit' });

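  // Mirror turbo's exit code. `code` is null when turbo was killed by a
  // signal, so treat that as a failure instead of exiting with 0.
  turbo.on('close', (code) => {
    process.exit(code ?? 1);
  });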
}

exec();

dobesv commented 2 years ago

If you pass the option -vv to turbo when you run it, you can get some extra debug logs that might help.

cpitt commented 2 years ago

@dobesv Yeah, -vv and -vvv show the GET and PUT operations working, but it ultimately ends up as a cache miss 🤷‍♂️.

I can see the items in S3. I can hit the Lambda endpoint and GET them, but the turbo cache client always treats it as a cache miss.

The same S3 bucket with a local server works with no problem.
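Since the local server works and the Lambda one does not, one way to narrow this down is to fetch the same artifact from both and compare what comes back. The sketch below is only an illustration: the `{ status, headers, body }` shape and the names `summarize` and `diffResponses` are assumptions made for this example, not part of turbo or turborepo-remote-cache, and the sample bytes are made up.

```javascript
const { createHash } = require('node:crypto');

// Reduce a response to the parts a cache client is likely to care about.
// `res` is a plain object { status, headers, body } with body as a Buffer.
function summarize(res) {
  const headers = {};
  for (const [name, value] of Object.entries(res.headers)) {
    headers[name.toLowerCase()] = String(value); // header names are case-insensitive
  }
  return {
    status: res.status,
    contentType: headers['content-type'],
    contentLength: headers['content-length'],
    bodyBytes: res.body.length,
    bodySha256: createHash('sha256').update(res.body).digest('hex'),
  };
}

// Return the names of the summary fields that differ between two responses.
function diffResponses(a, b) {
  const left = summarize(a);
  const right = summarize(b);
  return Object.keys(left).filter((key) => left[key] !== right[key]);
}

// Made-up example: the second response carries base64 text instead of raw bytes.
const artifact = Buffer.from([0x28, 0xb5, 0x2f, 0xfd, 0x00]);
const local = {
  status: 200,
  headers: { 'Content-Type': 'application/octet-stream' },
  body: artifact,
};
const viaLambda = {
  status: 200,
  headers: { 'content-type': 'application/octet-stream' },
  body: Buffer.from(artifact.toString('base64'), 'utf8'),
};

console.log(diffResponses(local, viaLambda)); // [ 'bodyBytes', 'bodySha256' ]
```

If the status and headers match but the body hash does not, the bytes are being changed somewhere between S3 and the client, which a quick look at curl output on a terminal would not necessarily reveal.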

dobesv commented 2 years ago

Hmm, strange. There must be something about the response it's getting that isn't right. Maybe you could run Wireshark or use a logging HTTP proxy to look at the actual HTTP responses and see if there's something weird going on. Maybe your Lambda is not returning things in the format you think.
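One format detail worth checking (a guess, not something confirmed in this thread): in an API Gateway Lambda proxy integration, the result's `body` is always a string, so binary data has to be base64-encoded and flagged with `isBase64Encoded: true`. A minimal sketch of the difference follows; `toProxyResult` and `bodyBytes` are hypothetical helpers written for this example, and the sample bytes are made up.

```javascript
// Build a proxy-integration style result for a binary payload.
function toProxyResult(bytes, contentType = 'application/octet-stream') {
  return {
    statusCode: 200,
    headers: { 'content-type': contentType },
    body: bytes.toString('base64'),
    isBase64Encoded: true,
  };
}

// Recover the bytes a client should end up with from a proxy result.
function bodyBytes(result) {
  return Buffer.from(result.body, result.isBase64Encoded ? 'base64' : 'utf8');
}

const artifact = Buffer.from([0x28, 0xb5, 0x2f, 0xfd, 0x00]);

// Base64 with the flag set round-trips the bytes exactly.
const ok = bodyBytes(toProxyResult(artifact));

// Passing binary through a UTF-8 string replaces invalid sequences and loses data.
const lossy = bodyBytes({
  statusCode: 200,
  headers: {},
  body: artifact.toString('utf8'),
  isBase64Encoded: false,
});

console.log(ok.equals(artifact));    // true
console.log(lossy.equals(artifact)); // false
```

If the responses do turn out to be mangled this way, the binary settings in API Gateway and in the adapter would be the place to look. The `@fastify/aws-lambda` README lists options for binary content types, though whether they apply here is untested.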