samvera / node-iiif

This module provides a full-featured IIIF Image API 2.1 and 3.0 image processor. It covers only the image processing pipeline itself, leaving input and output to the caller.
Apache License 2.0
27 stars 5 forks source link

Question - execute() #25

Closed codeclout closed 1 year ago

codeclout commented 1 year ago

Hello,

We have recently run into an issue where the (processor).execute function terminates requests without a result or error. This is causing the bypassing of downstream functionality (e.g. uploading images larger than 6MB to an S3 cache bucket in the serverless IIIF repo) .

If you could provide some scenarios as to why this would happen and/or how to resolve, it would be highly appreciated.

Thanks, Brian

mbklein commented 1 year ago

There are a few possibilities. The most common would be a hard exit from running out of memory during processing. Do you have any log output from the call?

codeclout commented 1 year ago

@mbklein - thank you. log output is below

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|                                                                                  message                                                                                  |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| INIT_START Runtime Version: nodejs:16.v11 Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:ee071345fbee0f4751fcbcd47a5784052d8f7d4256ca7f31e6056b294cee515c         |
| START RequestId: 0dfb090c-cfb9-424b-98e5-5a8c65b67ae1 Version: $LATEST                                                                                                    |
| END RequestId: 0dfb090c-cfb9-424b-98e5-5a8c65b67ae1                                                                                                                       |
| REPORT RequestId: 0dfb090c-cfb9-424b-98e5-5a8c65b67ae1 Duration: 6.95 ms Billed Duration: 7 ms Memory Size: 3008 MB Max Memory Used: 101 MB Init Duration: 693.56 ms      |
| START RequestId: 7aa49198-0378-43f9-9495-76e403fea456 Version: $LATEST                                                                                                    |
| END RequestId: 7aa49198-0378-43f9-9495-76e403fea456                                                                                                                       |
| REPORT RequestId: 7aa49198-0378-43f9-9495-76e403fea456 Duration: 2.12 ms Billed Duration: 3 ms Memory Size: 3008 MB Max Memory Used: 101 MB                               |
| START RequestId: 743a33f4-2b0a-4596-80e7-0525eed6b90a Version: $LATEST                                                                                                    |
| 2023-01-27T14:26:14.173Z 743a33f4-2b0a-4596-80e7-0525eed6b90a WARN Unable to get dimensions for nlm:nlmuid-101416076-img using custom function. Falling back to probe().  |
| 2023-01-27T14:26:24.096Z 743a33f4-2b0a-4596-80e7-0525eed6b90a Task timed out after 10.03 seconds                                                                          |
| END RequestId: 743a33f4-2b0a-4596-80e7-0525eed6b90a                                                                                                                       |
| REPORT RequestId: 743a33f4-2b0a-4596-80e7-0525eed6b90a Duration: 10026.46 ms Billed Duration: 10000 ms Memory Size: 3008 MB Max Memory Used: 454 MB                       |
| INIT_START Runtime Version: nodejs:16.v11 Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:ee071345fbee0f4751fcbcd47a5784052d8f7d4256ca7f31e6056b294cee515c         |
mbklein commented 1 year ago

The issue is right before the end of the log:

Task timed out after 10.03 seconds

There seems to be something about that image in particular that's making it take far too long to retrieve the image dimensions, and the lambda function is timing out after 10 seconds. You can try raising the timeout (via the IiifTimeoutValue deploy parameter), but 10 seconds is a pretty long time for an operation like this. It's more likely you're going to want to do something about the dimensions.

There are two “best practices” type options you should consider.

Object Metadata

The probe() function used for retrieving image dimensions on the fly is notoriously slow and it has caused me and other users a lot of trouble in the past. For that reason, serverless-iiif will attempt to read predetermined dimensions from S3 object metadata. (I could have sworn this functionality had better documentation, but I can't find it at the moment.)

The resolver checks the S3 object for metadata properties called width and height, and uses them to bypass the probe() if present.

Here's what the metadata looks like from the AWS console:

image

And here's what the same object's output looks like from aws s3api head-object:

{
    "AcceptRanges": "bytes",
    "LastModified": "2022-03-23T14:55:13+00:00",
    "ContentLength": 6437649,
    "ETag": "\"1b4c7cda130f001d9899fee0ca9f9f81\"",
    "ContentType": "application/octet-stream",
    "Metadata": {
        "width": "9102",
        "height": "6802"
    },
    "StorageClass": "INTELLIGENT_TIERING"
}

(Note that updating the metadata creates a copy of the object in place, so if your images are in a bucket that has object versioning turned on, updating the metadata will create a new version.)

Viewer Request Function

The other option is to use a viewer-request function to add an x-preflight-dimensions header to the request. I believe we also talked about viewer-request functions in samvera/serverless-iiif#92.

codeclout commented 1 year ago

Hi @mbklein, Thank you for your response.

We are using HTTP as the source. I have updated the resolvers.js file of the serverless-iiif repo with the following to implement the dimensionResolver using the probe-image-size module from the included layer in the serverless-iiif repo.

const fetch = (...args) => import('node-fetch').then(({default: fetch}) => fetch(...args));
const probeImage = require("probe-image-size");

// Create input stream from http location
const httpStream = async (location, callback) => {
  const result = await fetch(location);

  if (!result.ok) throw new Error(`unexpected response ${result.statusText}`);
  return await callback(result.body);
};

const defaultHttpLocation = (id) => {
  return `${process.env.HTTP_SOURCE}/${id}`;
}

const httpDimensionRetriever = async (location) => {
  const d = await probeImage(location);

  return {width: d.width, height: d.height};
};

const standardResolver = () => {
  if (`${process.env.USE_HTTP_SOURCE}`.toLowerCase() === 'true') {
    return {
      streamResolver: async ({ id, baseUrl }, callback) => {
        return httpStream(defaultHttpLocation(id), callback);
      },
      dimensionResolver: async ({ id, baseUrl }, callback) => {
        httpDimensionRetriever(defaultHttpLocation(id), callback);
      }
    }
  }

  return {
    streamResolver: async ({id, baseUrl}, callback) => {
      return s3Stream(defaultStreamLocation(id), callback);
    },
    dimensionResolver: async ({id, baseUrl}) => {
      return dimensionRetriever(defaultStreamLocation(id));
    }
  };
};

I have wrapped the execute function in console logs... and logs after the execute() do not run, even after supplying the image dimensions... meaning we no longer see the warning that the probe function is running.

if (cached) {
    response = forceFailover();
  } else {
    console.log('about to run execute() from node-iiif-processor module');
    const result = await resource.execute();

    console.log('after execute -> is the image too large', isTooLarge(result.body));
    if (isTooLarge(result.body)) {
      await makeCache(key, result);
      response = forceFailover();
    } else {
      response = makeResponse(result);
    }
  }

These are both snippets from the serverless-iiif repo, but I hope the snippets can add context to the issue we are experiencing. Please also note we are still receiving a timeout after 10 seconds.

Thanks in advance, Brian

mbklein commented 1 year ago

@codeclout I'm very interested in taking a look at this, but I'm up against a release deadline, so I'm going to have to put it off for at least a week. I'll make a note to check back as soon as I can.

codeclout commented 1 year ago

Hi @mbklein

Just following up with a few observations.

  1. The issues noted in the previous posts are only happening when the payload for the Lambda is larger than the 6 MB limit. I am seeing the output of the execute function when the payload is within bounds.
2023-02-20T18:16:38.821Z    92cc59c4-0ce0-4987-bf6b-009213ce13ed    INFO    execute {
  contentType: 'image/jpeg',
  body: <Buffer ff d8 ff db 00 43 00 06 04 05 06 05 04 06 06 05 06 07 07 06 08 0a 10 0a 0a 09 09 0a 14 0e 0f 0c 10 17 14 18 18 17 14 16 16 1a 1d 25 1f 1a 1b 23 1c 16 ... 285054 more bytes>
}
  1. With that information I increased the highwatermark of the stream, but that did not fix the issue.
// Create input stream from http location
const httpStream = async (location, callback) => {
  const result = await fetch(location, {
    highWaterMark: 1024 * 1024
});
  1. I continued to investigate, but ran into a black box. Essentially, I have not located where the toBuffer method is defined. https://github.com/samvera/node-iiif/blob/06f851913199008b83687d18db086cad6bf55b1a/index.js#L162

I will continue to investigate and look forward to your reply.

Thanks, Brian

mbklein commented 1 year ago

The toBuffer method is part of the sharp API. pipeline is the pre-constructed Sharp image transformer. Piping the image stream into it returns a Sharp image object, which then gets converted to a Buffer to return to the requestor.

I'd be curious to see what would happen if you used axios in streaming mode instead of node-fetch:

const axios = require("axios");

const httpStream = async (location, callback) => {
  const { data } = await axios(location, { responseType: "stream" });
  return data;
});
codeclout commented 1 year ago

@mbklein - thank you

After quite a bit of investigation, it seems that some, not all requests are resulting in a 403 being returned by the layer and the errors are not being propagated up, resulting in nothing being returned by the execute function. As an interim fix I have moved the iiif-processor out of the layer, higher in the dependency chain. All is working as expected now.

In an effort not to stray too far away from the original implementation is there a better approach than the sledge hammer I implemented? We are using version 1 of the layer, the issue is intermittent and everything works as expected now that the processor is not being called from the layer.

Thanks in Advance, Brian

mbklein commented 1 year ago

If your approach is working for you, I think it's fine. There are a couple outstanding issues with the layer (they say v3 but I wouldn't be surprised if the issues exist in others as well); I just haven't had time to get back and look into them yet.

codeclout commented 1 year ago

@mbklein - thanks for all of your help