Niels-IO / next-image-export-optimizer

Use Next.js advanced <Image/> component with the static export functionality. Optimizes all static images in an additional step after the Next.js static export.

Remote image download error: `ENAMETOOLONG: name too long` #216

Closed · beschler closed this issue 1 week ago

beschler commented 1 month ago

Hi there! Very much appreciate your work on this package.

I'm running into an issue using this package to download remote images from an AWS S3 bucket using pre-signed URLs. My S3 bucket is private, and my goal is to download the remote images from AWS at build time as static assets for my static Next.js website.

Here are the contents of my remoteOptimizedImages.js file:

require("dotenv").config();
const { Pool } = require("pg");
const { S3, GetObjectCommand } = require("@aws-sdk/client-s3");
const { getSignedUrl } = require("@aws-sdk/s3-request-presigner");

const dbconnect = {
  host: process.env.DB_HOST,
  port: process.env.DB_PORT,
  user: process.env.DB_USER,
  database: process.env.DB_NAME,
};

const db = new Pool(dbconnect);

async function getStaticImages() {
  // get all photos from PostgreSQL database
  const client = await db.connect();
  const all_photos = await client.query(`SELECT filename FROM photos`);
  client.release();

  // connect to AWS S3 bucket
  const s3 = new S3({
    region: process.env.AWS_S3_REGION,
    credentials: {
      accessKeyId: process.env.AWS_ACCESS_KEY_ID,
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
    },
  });

  let photos = [];

  for (const row of all_photos.rows) {
    // get photo object from AWS S3 bucket
    const getObject = new GetObjectCommand({
      Bucket: process.env.NEXT_PUBLIC_AWS_S3_BUCKET,
      Key: row.filename,
    });

    // get pre-signed URL for AWS S3 bucket object
    const signedUrl = await getSignedUrl(s3, getObject, {
      expiresIn: 3600,
    });

    photos.push(signedUrl);
  }

  console.log(photos);

  return photos;
}

module.exports = getStaticImages();

However, I'm receiving an error stating the file name of the remote image is too long:

---- next-image-export-optimizer: Begin with optimization... ---- 
Found 62 remote images...
Error: Unable to save /Users/beschler/git/MY_REPO/remoteImagesForOptimization/MY_AWS_S3_BKT.s3.us-west-1.amazonaws.com_3af9282b9cac.webp_X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4MTWG2XAGGQA57F5_2F20240515_2Fus-west-1_2Fs3_2Faws4_request&X-Amz-Date=20240515T033806Z&X-Amz-Expires=3600&X-Amz-Signature=89aa0609baee3266e81d5103c92a23f8cb7eb1f6859d7391919efdd2fc2152f1&X-Amz-SignedHeaders=host&x-id=GetObject.jpg (ENAMETOOLONG: name too long, open '/Users/beschler/git/MY_REPO/remoteImagesForOptimization/MY_AWS_S3_BKT.s3.us-west-1.amazonaws.com_3af9282b9cac.webp_X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4MTWG2XAGGQA57F5_2F20240515_2Fus-west-1_2Fs3_2Faws4_request&X-Amz-Date=20240515T033806Z&X-Amz-Expires=3600&X-Amz-Signature=89aa0609baee3266e81d5103c92a23f8cb7eb1f6859d7391919efdd2fc2152f1&X-Amz-SignedHeaders=host&x-id=GetObject.jpg').
Error: Unable to download remote images (ENAMETOOLONG: name too long, open '/Users/beschler/git/MY_REPO/remoteImagesForOptimization/MY_AWS_S3_BKT.s3.us-west-1.amazonaws.com_3af9282b9cac.webp_X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4MTWG2XAGGQA57F5_2F20240515_2Fus-west-1_2Fs3_2Faws4_request&X-Amz-Date=20240515T033806Z&X-Amz-Expires=3600&X-Amz-Signature=89aa0609baee3266e81d5103c92a23f8cb7eb1f6859d7391919efdd2fc2152f1&X-Amz-SignedHeaders=host&x-id=GetObject.jpg').
node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: ENAMETOOLONG: name too long, open '/Users/beschler/git/MY_REPO/remoteImagesForOptimization/MY_AWS_S3_BKT.s3.us-west-1.amazonaws.com_3af9282b9cac.webp_X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4MTWG2XAGGQA57F5_2F20240515_2Fus-west-1_2Fs3_2Faws4_request&X-Amz-Date=20240515T033806Z&X-Amz-Expires=3600&X-Amz-Signature=89aa0609baee3266e81d5103c92a23f8cb7eb1f6859d7391919efdd2fc2152f1&X-Amz-SignedHeaders=host&x-id=GetObject.jpg'] {
  errno: -63,
  code: 'ENAMETOOLONG',
  syscall: 'open',
  path: '/Users/beschler/git/MY_REPO/remoteImagesForOptimization/MY_AWS_S3_BKT.s3.us-west-1.amazonaws.com_3af9282b9cac.webp_X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4MTWG2XAGGQA57F5_2F20240515_2Fus-west-1_2Fs3_2Faws4_request&X-Amz-Date=20240515T033806Z&X-Amz-Expires=3600&X-Amz-Signature=89aa0609baee3266e81d5103c92a23f8cb7eb1f6859d7391919efdd2fc2152f1&X-Amz-SignedHeaders=host&x-id=GetObject.jpg'
}

Node.js v20.11.1

I'm not able to modify the file names in the remote image URLs, as they're generated automatically by AWS. If it's helpful, here's an example of an image URL generated by the @aws-sdk/s3-request-presigner package (one of the URLs I'm passing into the array in remoteOptimizedImages.js):

https://MY_AWS_S3_BKT.s3.us-west-1.amazonaws.com/IMG_FILENAME.webp?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=AKIA4MTWG2XAGGQA57F5%2F20240515%2Fus-west-1%2Fs3%2Faws4_request&X-Amz-Date=20240515T040052Z&X-Amz-Expires=3600&X-Amz-Signature=5bb7c6b44b78482dbe9a5b5bc69b32b90f3d4552193401aa336bf7bbfcac2ffa&X-Amz-SignedHeaders=host&x-id=GetObject

Is there some way to modify the downloaded names of these images, or truncate them if they're longer than Node or my file system can handle?

Thanks in advance for your help!


System information:

- Device: MacBook Pro 16-inch, 2023, M2 Pro
- OS: macOS Ventura 13.6.6
- Browser: Google Chrome v124.0.6367.201
- Node.js Version: 20.11.1
- Next.js Version: 14.2.2
- next-image-export-optimizer Version: 1.12.3

beschler commented 1 month ago

Hey there - quick update:

I did some digging into the code for this package and wrote a very simple workaround that I was wondering if you'd review.

In file utils/urlToFilename.ts:

module.exports = function urlToFilename(url: string) {
  // Remove the protocol from the URL
  // let filename = url.replace(/^(https?|ftp):\/\//, "");

  // CHANGES START...
  // Strip URL parameters from URL
  let filename = url.split(`?`)[0];

  // Isolate filename from URL
  filename = filename.split(`/`).pop();

  // Prepend a timestamp to make the filename unique
  const timestamp = Date.now();
  filename = `${timestamp}-${filename}`;
  // ...CHANGES END

  // Replace special characters with underscores
  filename = filename.replace(/[/\\:*?"<>|#%]/g, "_");

  // Remove control characters
  // eslint-disable-next-line no-control-regex
  filename = filename.replace(/[\x00-\x1F\x7F]/g, "");

  // Trim any leading or trailing spaces
  filename = filename.trim();

  return filename;
};

Essentially, instead of using the entire URL (including query parameters) as the filename (which in my case is enormously long), it strips all URL parameters and keeps only the content after the last path delimiter (/). It then prepends a timestamp so that otherwise-identical filenames stay unique (which may be unnecessary).

(You could even potentially go one step further to ensure the maximum length of the filename string is no greater than 100 characters, for example.)
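The length cap mentioned above could be sketched as follows. This is a hypothetical helper, not part of the package; the 100-character limit and the extension-preserving behavior are my assumptions:

```javascript
// Hypothetical helper: cap a derived filename at a maximum length
// while keeping the file extension intact.
function truncateFilename(filename, maxLength = 100) {
  if (filename.length <= maxLength) return filename;
  const dot = filename.lastIndexOf(".");
  // No extension (or a leading dot): plain truncation.
  if (dot <= 0) return filename.slice(0, maxLength);
  const ext = filename.slice(dot); // e.g. ".webp"
  return filename.slice(0, maxLength - ext.length) + ext;
}

// Example: a 205-character name is cut down to 100 characters
// but still ends in ".webp".
const longName = "x".repeat(200) + ".webp";
console.log(truncateFilename(longName).length); // 100
```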

Perhaps this could be implemented behind some kind of setting in next.config.js, e.g. nextImageExportOptimizer_truncateRemoteFilenames: true?

As a (very) temporary solution to get my project working, I have tested this code by modifying /node_modules/next-image-export-optimizer/dist/utils/urlToFilename.js, and it works! I'm no longer getting the error described, and my project works like a charm.

Please let me know your thoughts - again, this package is a life saver šŸ™šŸ» Thanks so much for your work on this.

Niels-IO commented 3 weeks ago

Hi @beschler,

I am worried about uniqueness. In essence, I want a sort of hash of the filename, but without the headache that comes with the Web Crypto API.

I am thinking of implementing this hashing algorithm to replace the URL: https://stackoverflow.com/a/52171480
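The linked Stack Overflow answer is the cyrb53 string hash, a fast, deterministic, non-cryptographic 53-bit hash. A sketch of how it could map a long pre-signed URL to a short, stable filename; the hex naming scheme at the end is an illustration, not the package's actual scheme:

```javascript
// cyrb53 string hash (from the Stack Overflow answer linked above):
// deterministic, so the same URL always maps to the same value.
const cyrb53 = (str, seed = 0) => {
  let h1 = 0xdeadbeef ^ seed,
    h2 = 0x41c6ce57 ^ seed;
  for (let i = 0, ch; i < str.length; i++) {
    ch = str.charCodeAt(i);
    h1 = Math.imul(h1 ^ ch, 2654435761);
    h2 = Math.imul(h2 ^ ch, 1597334677);
  }
  h1 = Math.imul(h1 ^ (h1 >>> 16), 2246822507);
  h1 ^= Math.imul(h2 ^ (h2 >>> 13), 3266489909);
  h2 = Math.imul(h2 ^ (h2 >>> 16), 2246822507);
  h2 ^= Math.imul(h1 ^ (h1 >>> 13), 3266489909);
  return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};

// Hypothetical naming scheme: hash the full URL, render it in hex,
// and reattach a short extension.
const url = "https://example.com/IMG.webp?X-Amz-Signature=abc";
const filename = cyrb53(url).toString(16) + ".webp";
```

Because the hash is a pure function of the URL string, any code that sees the same URL (build step or client) derives the same filename, with no shared state required.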

Your solution uses a timestamp generated at build time. How would the front end (client) generate the same timestamp to reference these filenames?
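The reproducibility concern can be seen directly. This is a hypothetical reduction of the proposed workaround, not the package's code: two calls with the same URL produce different names unless they happen in the same millisecond.

```javascript
// Hypothetical reduction of the timestamp-based workaround:
// the output depends on *when* it runs, not just on the URL.
function urlToFilenameWithTimestamp(url) {
  const basename = url.split("?")[0].split("/").pop();
  return `${Date.now()}-${basename}`;
}

const url = "https://example.com/photo.webp";
const atBuildTime = urlToFilenameWithTimestamp(url);
// ...later, e.g. when the client renders an <Image/>...
const atRenderTime = urlToFilenameWithTimestamp(url);
// atBuildTime and atRenderTime only match if both calls land in the
// same millisecond, so the client cannot recover the build-time name.
```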

Niels-IO commented 3 weeks ago

Hi @beschler,

Could you please test the hashing function in v1.14.0, which I just published? It should eliminate the problem altogether by using a hash value.