Chromium Fails to Launch in Serverless Environment (AWS Lambda) - Error: "The input directory '/var/task/bin' does not exist."

jacksonkasi1 commented 3 weeks ago

I am using @sparticuz/chromium with puppeteer-core in an AWS Lambda function to capture screenshots. While running in the serverless environment (AWS Lambda), Chromium fails to launch and throws the following error:

Error: The input directory "/var/task/bin" does not exist.
    at Chromium.executablePath (/var/task/src/index.js:4652:17)
    at getChrome (/var/task/src/index.js:85106:53)

The error occurs while resolving the Chromium executable path through @sparticuz/chromium. It seems the library isn't finding the binary from the Lambda layer in the expected location.

Steps to Reproduce

I deployed an AWS Lambda function using puppeteer-core and @sparticuz/chromium.
The Lambda function attempts to capture a screenshot of a given URL.
On invoking the Lambda function, the following error is logged:

Error: The input directory "/var/task/bin" does not exist.

Code Snippets

#!/bin/bash

# Stop the script if any command fails
set -e

# Variables
BUCKET_NAME="xxxxxxx-bucket"
LAYER_NAME="chromium"
VERSION_NUMBER="127.0.0"
S3_KEY="chromiumLayers/chromium${VERSION_NUMBER}.zip"
LAYER_FILE="chromium-layer-arn.txt"
CHROMIUM_ZIP="chromium-v${VERSION_NUMBER}.zip"
CHROMIUM_LAYER_URL="https://github.com/Sparticuz/chromium/releases/download/v${VERSION_NUMBER}/chromium-v${VERSION_NUMBER}-layer.zip"
REGION="ap-south-1"
RUNTIME="nodejs20.x"
ARCHITECTURE="x86_64"

# Check if necessary tools are installed
check_tools() {
    command -v curl >/dev/null 2>&1 || { echo >&2 "Error: curl is not installed. Please install it."; exit 1; }
    command -v aws >/dev/null 2>&1 || { echo >&2 "Error: AWS CLI is not installed. Please install it."; exit 1; }
}

# Check if the S3 bucket exists
check_s3_bucket() {
    if ! aws s3 ls "s3://${BUCKET_NAME}" >/dev/null 2>&1; then
        echo "Error: The specified S3 bucket (${BUCKET_NAME}) does not exist. Please create it first."
        exit 1
    fi
}

# Function to create the layer
create_layer() {
    check_tools
    check_s3_bucket

    # Step 1: Download or build Chromium layer
    if [ ! -f "${CHROMIUM_ZIP}" ]; then
        echo "Downloading Chromium layer..."
        curl -L -o "${CHROMIUM_ZIP}" "${CHROMIUM_LAYER_URL}" || { echo "Error: Failed to download Chromium."; exit 1; }
    fi

    # Step 2: Upload Chromium zip to S3
    echo "Uploading Chromium layer to S3..."
    aws s3 cp "${CHROMIUM_ZIP}" "s3://${BUCKET_NAME}/${S3_KEY}" --region "${REGION}" || { echo "Error: Failed to upload to S3."; exit 1; }

    # Step 3: Create the Lambda layer
    echo "Creating Lambda Layer..."
    LAYER_ARN=$(aws lambda publish-layer-version \
        --layer-name "${LAYER_NAME}" \
        --description "Chromium v${VERSION_NUMBER}" \
        --content "S3Bucket=${BUCKET_NAME},S3Key=${S3_KEY}" \
        --compatible-runtimes "${RUNTIME}" \
        --compatible-architectures "${ARCHITECTURE}" \
        --region "${REGION}" \
        --query 'LayerVersionArn' --output text) || { echo "Error: Failed to create Lambda layer."; exit 1; }

    # Step 4: Save Layer ARN to file
    echo "Saving Layer ARN to ${LAYER_FILE}..."
    echo "${LAYER_ARN}" > "${LAYER_FILE}"

    echo "Layer created successfully! Layer ARN: ${LAYER_ARN}"
}

# Function to remove the layer
remove_layer() {
    if [ ! -f "${LAYER_FILE}" ]; then
        echo "Layer ARN file not found!"
        exit 1
    fi

    LAYER_ARN=$(cat "${LAYER_FILE}")
    echo "Removing Lambda Layer with ARN: ${LAYER_ARN}..."

    # Extract layer name and version from ARN
    LAYER_NAME=$(echo "${LAYER_ARN}" | cut -d: -f7)
    LAYER_VERSION=$(echo "${LAYER_ARN}" | cut -d: -f8)

    # Delete the Lambda layer version
    aws lambda delete-layer-version --layer-name "${LAYER_NAME}" --version-number "${LAYER_VERSION}" --region "${REGION}" || { echo "Error: Failed to delete Lambda layer."; exit 1; }

    echo "Layer version ${LAYER_VERSION} deleted."

    # Remove the saved ARN file
    rm -f "${LAYER_FILE}"
    echo "Layer ARN file removed."
}

# Function to print usage
print_usage() {
    echo "Usage: $0 {create|remove}"
    echo "Commands:"
    echo "  create  - Create and publish the Chromium Lambda layer"
    echo "  remove  - Remove the Chromium Lambda layer and its ARN file"
}

# Main script execution
if [ "$#" -ne 1 ]; then
    print_usage
    exit 1
fi

if [ "$1" == "create" ]; then
    create_layer
elif [ "$1" == "remove" ]; then
    remove_layer
else
    print_usage
    exit 1
fi

// chrome-script.ts
import chromium from '@sparticuz/chromium';
import puppeteer from 'puppeteer-core';

export const getChrome = async (url: string) => {
  let executablePath = null;

  if (process.env.IS_OFFLINE === 'true') {
    const chromeExecutablePath = "C:/Program Files/Google/Chrome/Application/chrome.exe";
    executablePath = chromeExecutablePath;
    console.log("Launching Chrome locally...");
  } else {
    console.log("Launching Chrome in serverless environment...");
    executablePath = await chromium.executablePath() || '/var/task/bin/chromium';
    console.log(`🚀 Executable path: ${executablePath}`);

    if (!executablePath) {
      executablePath = '/var/task/bin/chromium';
    }
  }

  const browser = await puppeteer.launch({
    executablePath: executablePath,
    headless: true,
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--window-size=1920,1080',
    ],
  });

  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle0' });

  return { browser, page };
};

# serverless.yml
service: auto-pdf-processor
frameworkVersion: "3"

provider:
  name: aws
  runtime: nodejs20.x
  region: ap-south-1
  memorySize: 2048
  stage: production
  timeout: 30
  environment:
    NODE_ENV: production
    VERSION: v20.0
    IS_OFFLINE: ${opt:isOffline, 'false'}

plugins:
  - serverless-esbuild
  - serverless-offline

functions:
  server:
    handler: src/index.handler
    layers:
      - arn:aws:lambda:ap-south-1:xxxx:layer:chromium:x
    events:
      - http:
          path: /
          method: ANY
          cors: true
      - http:
          path: /{proxy+}
          method: ANY
          cors: true

Environment

Node.js Version: 20.x
@sparticuz/chromium Version: 127.0.0
puppeteer-core Version: 21.5.0
AWS Lambda: Node.js 20.x runtime
Serverless Framework: v3

Expected Behavior

Chromium should launch without errors in the AWS Lambda environment, and I should be able to take screenshots of the provided URLs using puppeteer-core and @sparticuz/chromium.

Actual Behavior

Chromium fails to launch, and I get the following error:

Error: The input directory "/var/task/bin" does not exist.

Troubleshooting Steps Taken

Verified that the correct Chromium Lambda layer is being used.
Logged the output of chromium.executablePath() and found that it's resolving to null or pointing to a non-existent directory (/var/task/bin).
Tried manually setting the executable path to /var/task/bin/chromium, but the error persists.
Checked the contents of the Chromium Lambda layer to confirm that it includes the correct binary for the x86_64 architecture.

Possible Cause

It seems like the Chromium binary path is not correctly resolved in the Lambda environment, possibly due to the way the Lambda layer is set up or how chromium.executablePath() is being used in this specific environment.

Logs

2024-09-09T16:20:59.141Z 41477296-1073-459f-b2ab-98d389a31a97 INFO Launching Chrome in serverless environment...
2024-09-09T16:20:59.142Z 41477296-1073-459f-b2ab-98d389a31a97 ERROR Error capturing screenshot:  Error: The input directory "/var/task/bin" does not exist.
    at Chromium.executablePath (/var/task/src/index.js:4652:17)
    at getChrome (/var/task/src/index.js:85106:53)
    at /var/task/src/index.js:85143:26
    at dispatch (/var/task/src/index.js:72798:23)

Request for Help

Could this be a configuration issue with the Lambda layer or an incorrect way of resolving the Chromium executable path? Any guidance on how to resolve this would be greatly appreciated.

Merynek commented 3 weeks ago

same issue

jacksonkasi1 commented 3 weeks ago

Hi @Merynek, let me know if there’s anything else you think could be worth trying in the meantime.

Merynek commented 3 weeks ago

I tried it on localhost, outside of serverless, and the call chromium.executablePath() works. But when launching Puppeteer I got this error: "Failed to launch the browser process! spawn C:\Users\merav\AppData\Local\Temp\chromium ENOENT". So maybe it would be the same issue on serverless. So I have 2 issues :-D

jacksonkasi1 commented 3 weeks ago

Hi @Merynek, I’ve raised the same issue over at Puppeteer. You can check the details here: Puppeteer Issue #13069.

The Chromium binary is now launching, but I’m currently facing an issue with Puppeteer itself. Would appreciate any insights you might have!

DAcodedBEAT commented 3 weeks ago

Hey @jacksonkasi1 👋

I just ran into the same issue. This appears to happen because we are bundling the code, which this project doesn't support by default (according to https://github.com/Sparticuz/chromium/blob/master/source/index.ts#L320).

I worked around this by explicitly passing the location where the Chromium Lambda layer files get overlaid:

await chromium.executablePath("/opt/nodejs/node_modules/@sparticuz/chromium/bin")
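
The stack trace earlier in the thread shows Chromium.executablePath running from the bundled /var/task/src/index.js. If the library derives its default input directory from its own module location, as the linked source line suggests, bundling would plausibly shift that default to /var/task/bin, which matches the error. A minimal sketch of the workaround follows; `resolveInputDir` and the injected `exists` predicate are hypothetical helpers, not part of @sparticuz/chromium, and the layer path is the one quoted above.

```typescript
// Hypothetical helper (not part of @sparticuz/chromium): choose the directory
// to hand to chromium.executablePath(). Returns undefined when the layer
// directory is absent, so the library can fall back to its own default.
type Exists = (path: string) => boolean;

// Layer location quoted in this thread.
const LAYER_BIN_DIR = "/opt/nodejs/node_modules/@sparticuz/chromium/bin";

function resolveInputDir(
  exists: Exists,
  layerDir: string = LAYER_BIN_DIR,
): string | undefined {
  return exists(layerDir) ? layerDir : undefined;
}

// Intended use inside the handler (assumes fs.existsSync as the predicate):
//   const executablePath = await chromium.executablePath(resolveInputDir(existsSync));
```

Injecting the predicate keeps the helper testable off Lambda; when running unbundled on a local machine the layer directory will not exist, and the library's default is used instead.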

jacksonkasi1 commented 2 weeks ago

Hi @DAcodedBEAT,

I followed your guidance and attempted to set the executable path to the Chromium binary within /opt/nodejs/node_modules/@sparticuz/chromium/bin. Below are the key steps I took and the corresponding errors:

Code:

// @ts-ignore
import chromium from "@sparticuz/chromium";
import puppeteer from "puppeteer-core";
import { readdirSync } from "fs";

export const getChrome = async (url: string) => {
  let browser = null;

  try {
    console.log("Launching Chrome in serverless environment...");

    // Log available directories
    const availableDirs = readdirSync("/opt/nodejs/node_modules/@sparticuz/chromium");
    console.log("Available directories in /opt/nodejs/node_modules/@sparticuz/chromium:", availableDirs);

    // List files in bin directory
    const binFiles = readdirSync("/opt/nodejs/node_modules/@sparticuz/chromium/bin");
    console.log("Files in /opt/nodejs/node_modules/@sparticuz/chromium/bin:", binFiles);

    // Set executable path to Chromium
    const executablePath = "/opt/nodejs/node_modules/@sparticuz/chromium/bin";
    console.log(`🚀 Executable path: ${executablePath}`);

    browser = await puppeteer.launch({
      executablePath,
      headless: true,
      args: chromium.args,
    });

    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });

    return { browser, page };
  } catch (error) {
    console.error("Error launching Chrome:", error);
    throw error;
  } finally {
    if (browser) await browser.close();
  }
};

Errors Encountered:

First Attempt (setting path to /opt/nodejs/node_modules/@sparticuz/chromium/bin/chromium):
```
ERROR Error launching Chrome: Error: Browser was not found at the configured executablePath (/opt/nodejs/node_modules/@sparticuz/chromium/bin/chromium)
```
It seems the Chromium binary was not found at the specified path.

Second Attempt (setting path to /opt/nodejs/node_modules/@sparticuz/chromium/bin):
```
ERROR Error launching Chrome: Error: Failed to launch the browser process! spawn /opt/nodejs/node_modules/@sparticuz/chromium/bin EACCES
```
This indicates a permission issue (EACCES), likely due to the binary needing to be extracted or inflated before it can be used.

Debug Logs:

Here’s what I found during logging:

INFO Available directories in /opt/nodejs/node_modules/@sparticuz/chromium: [ 'bin', 'build', 'package.json' ]
INFO Files in /opt/nodejs/node_modules/@sparticuz/chromium/bin: [ 'al2.tar.br', 'al2023.tar.br', 'chromium.br', 'fonts.tar.br', 'swiftshader.tar.br' ]

It seems the binary is still in compressed format (chromium.br), which may explain why it’s not launching properly.

Would appreciate any advice on how to properly extract the .br files or inflate them for use in Lambda. Thanks!
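
The .br extension conventionally marks Brotli-compressed files, so the listing above shows compressed archives rather than a runnable binary. As a rough illustration only, here is a Brotli round trip with Node's built-in zlib module on a stand-in payload. It is not the library's extraction code, and elsewhere in this thread the recommended route is to let chromium.executablePath(...) handle these files rather than decompressing them by hand.

```typescript
import { brotliCompressSync, brotliDecompressSync } from "node:zlib";

// Stand-in payload; a real chromium.br would hold the browser binary.
const original = Buffer.from("stand-in for an executable".repeat(100));

// Compress, as a packaging step might do before shipping a .br file.
const compressed = brotliCompressSync(original);

// Decompress, the step that has to happen before anything can be spawned.
const restored = brotliDecompressSync(compressed);

console.log(`compressed ${original.length} bytes down to ${compressed.length}`);
console.log(`round trip intact: ${restored.equals(original)}`);
```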

Sparticuz commented 2 weeks ago

You need to await chromium.executablePath(LOCATION_OF_BR_FILE) and pass its result to Puppeteer, instead of passing the directory path to Puppeteer directly.

jacksonkasi1 commented 2 weeks ago

Hi @Sparticuz,

Thanks for the guidance! I updated the code as per your suggestion by using await chromium.executablePath(executablePath) instead of directly passing the path. Here's the updated snippet:

// Directory that holds the compressed Chromium files (.br); chromium.executablePath() resolves the binary from it
const executablePath = "/opt/nodejs/node_modules/@sparticuz/chromium/bin";
console.log(`🚀 Executable path: ${executablePath}`);

browser = await puppeteer.launch({
  executablePath: await chromium.executablePath(executablePath),
  headless: true,
  args: chromium.args,
});

const page = await browser.newPage();
await page.goto(url, { waitUntil: 'networkidle0' });

However, I'm still encountering the same issue that I previously raised on Puppeteer (Puppeteer Issue #13069).

Logs:

INFO Files in /opt/nodejs/node_modules/@sparticuz/chromium/bin: [ 'al2.tar.br', 'al2023.tar.br', 'chromium.br', 'fonts.tar.br', 'swiftshader.tar.br' ]
INFO 🚀 Executable path: /opt/nodejs/node_modules/@sparticuz/chromium/bin
ERROR Error capturing screenshot:  TargetCloseError: Protocol error (Page.bringToFront): Session closed. Most likely the page has been closed.
ERROR Error: Protocol error: Connection closed. Most likely the page has been closed.

It seems that Puppeteer is not able to keep the page open and the session is closing prematurely. The Chromium binary is present in .br format, but the page fails with TargetCloseError: Session closed. Most likely the page has been closed.

I would appreciate any further guidance on how I can resolve this.

Thanks!
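
One thing that may be worth ruling out for "Session closed" errors: the fuller getChrome snippet earlier in this thread returns { browser, page } from inside a try whose finally block closes the browser. In JavaScript the finally block runs before the caller receives the returned value, so the caller would get handles to a browser that is already closed. Whether that is what happened here is not confirmed; the author later attributes the fix to a configuration mistake. The sketch below uses a hypothetical `FakeBrowser` stand-in rather than Puppeteer so that it runs without any dependencies.

```typescript
// Hypothetical stand-in for a Puppeteer browser; only tracks open/closed state.
class FakeBrowser {
  closed = false;
  async close(): Promise<void> {
    this.closed = true;
  }
}

// Mirrors the shape of the earlier snippet: return inside try, close in finally.
async function getBrowserClosedInFinally(): Promise<FakeBrowser> {
  const browser = new FakeBrowser();
  try {
    return browser;
  } finally {
    await browser.close(); // runs before the caller sees the return value
  }
}

// Alternative shape: close only on failure and leave cleanup to the caller.
async function getBrowserCallerCloses(): Promise<FakeBrowser> {
  const browser = new FakeBrowser();
  try {
    return browser;
  } catch (error) {
    await browser.close();
    throw error;
  }
}

async function demo(): Promise<boolean[]> {
  const a = await getBrowserClosedInFinally();
  const b = await getBrowserCallerCloses();
  const states = [a.closed, b.closed]; // [true, false]
  await b.close(); // the caller is responsible for closing when done
  return states;
}

demo().then((states) => console.log(states));
```

With the second shape, the code that takes the screenshot closes the browser itself once it has finished with the page.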

jacksonkasi1 commented 2 weeks ago

Also, I have attached a log output CSV file which shows that all logs were captured. log-events-viewer-result.csv

jacksonkasi1 commented 2 weeks ago

Hey @Sparticuz, @DAcodedBEAT, and @Merynek 👋,

Thank you all for your help and time. I’ve resolved the issue – it turned out to be a configuration mistake on my end, and there’s no problem with the library itself.

For anyone who may face a similar issue, I’ve set up a simple code configuration to use Chrome in a Lambda serverless environment. I’ve also added the necessary steps and included a script to automate setting up the Chromium Lambda Layer.

You can check out the full setup here: aws-lambda-chromium-screenshot-api.

Hopefully, this will be helpful for others facing the same challenge.

I'll go ahead and close this issue. Thanks again!
