samvera / serverless-iiif

IIIF Image API 2.1 & 3.0 server in an AWS Serverless Application
https://samvera.github.io/serverless-iiif/
Apache License 2.0

Error: Input image exceeds pixel limit #135

Closed t4k closed 7 months ago

t4k commented 9 months ago

[Original title: still seem to hit a 6 MB limit]

I have finally gotten around to trying out the 5.x version. Having followed the development fairly closely, I'm under the impression that the 6 MB limit should not be an issue anymore (see README commit).

I just spun up this stack today and dropped some images in my bucket to try it out and I'm having the issue of Error: Input image exceeds pixel limit.

Working (351 KB): image info
Broken (13.4 MB): image info

Am I under the wrong impression? Are there any logs that would be helpful?

Thank you for all the work, btw!

mbklein commented 9 months ago

The 6MB limit isn't about your source images; it's about the size of the response. For example, our source bucket contains pyramid TIFFs that are tens of megabytes in size. Our largest is 86MB. If I request it at /full/full/0/default.jpg, the response is 69MB, but it works. You'd never run into the 6MB limit requesting an info.json because the response will never be more than 1 or 2KB.

The issue I think you're running into is that your full size source image has more total pixels in it than the underlying image processing library can handle, so it's failing to even load it. It's not a limit on the width or the height, but a total WxH limit of 268402689 pixels. The thing that's confusing me about it is that this constructor option is supposed to override that limit. It seems not to be working.

Would you be able to share the broken source image with me so that I can do some troubleshooting with it? And can you confirm that you're using the current version of serverless-iiif? (v5.0.1) That will help me understand exactly which versions of the dependencies are deployed.

That last question has also given me the idea to create some kind of diagnostic output for the lambda, but without compromising security. I'll have to mull that one over a bit.

t4k commented 9 months ago

Very clarifying! Thank you! I didn't understand there was a total pixel limit (even though the error message says exactly that).

I can confirm that I'm using the v5.0.1 stack. I see serverlessrepo:semanticVersion 5.0.1 in the tags.

1.10-22.ptif.zip

I had to zip the file because of GitHub limitations. It is a pyramid TIFF and I use the .ptif extension to help me keep track of that. I hadn't actually looked at the dimensions of the image, and this one is ridiculously large for not being a map of some kind (19149 × 19469 pixels).

If it comes down to it I can probably do some processing of our access images so that they conform to a limit, but I'm hoping it can work smoothly without that.
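A quick sanity check of the numbers in this thread: the 268402689 limit mentioned above is 0x3FFF × 0x3FFF (16383²), and the image here is 19149 × 19469, which puts it well past that total:

```javascript
// Check whether a 19149 × 19469 image exceeds the default input pixel limit
// (268402689, i.e. 0x3FFF * 0x3FFF, per the error discussed above).
const DEFAULT_LIMIT = 0x3FFF * 0x3FFF; // 268402689 total pixels
const width = 19149;
const height = 19469;
const pixels = width * height; // 372811881

console.log(pixels > DEFAULT_LIMIT); // → true: the image fails to load
```

So the image is roughly 1.4× over the default cap, which is why it fails to load at all rather than failing on a particular dimension.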

mbklein commented 9 months ago

Ahhh, I see it now. I had limitInputPixels here but not here. I've fixed it now and pushed node-iiif v4.0.3 and serverless-iiif v5.0.2, so you should be good to go if you update your stack.

That said, the reason you ran into this in the first place (and I'm glad you did, because it surfaced the bug) is that serverless-iiif has to read the image metadata on the fly at runtime. I've just noticed the docs are missing this bit of information, but if you add width and height metadata (and, optionally, pages, indicating the number of pre-sized resolutions within your ptiff) to your S3 objects, serverless-iiif will use that instead and save a lot of runtime overhead.

t4k commented 9 months ago

Excellent! I can add metadata. Would the metadata headers be formatted like these?

x-amz-meta-width, x-amz-meta-height, and x-amz-meta-pages

mbklein commented 9 months ago

If you're using the S3 API directly, yes, those are the headers. If you're using the AWS console, CLI, or any of the official SDKs, you just use the names (width, height, pages) and they'll be turned into the proper headers for you. Unfortunately, it's not possible to update only the metadata on an existing object – you have to copy the object onto itself with the new metadata. But once you make it part of your workflow, it should be pretty smooth.
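For reference, the copy-onto-itself step might look like this with the AWS SDK for JavaScript v3. This is an untested sketch; the bucket name and pages value are placeholders, and the dimensions are the ones from the image above:

```javascript
// Sketch: copy an S3 object onto itself to attach dimension metadata.
// Bucket name and "pages" value are hypothetical placeholders.
import { S3Client, CopyObjectCommand } from "@aws-sdk/client-s3";

const Bucket = "my-iiif-source-bucket"; // placeholder
const Key = "1.10-22.ptif";

const s3 = new S3Client({});
await s3.send(new CopyObjectCommand({
  Bucket,
  Key,
  CopySource: `${Bucket}/${Key}`, // copy the object onto itself
  MetadataDirective: "REPLACE",   // without this, existing metadata is copied as-is
  Metadata: {
    width: "19149",  // stored as x-amz-meta-width
    height: "19469", // stored as x-amz-meta-height
    pages: "8",      // hypothetical pyramid level count
  },
}));
```

Note that with MetadataDirective set to REPLACE, any metadata not included in the Metadata map is dropped, so include everything you want to keep.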