tnc-ca-geo / animl-ml

Machine Learning resources for camera trap data processing

Inference fails on large images on classifier endpoints #118

Closed nathanielrindlaub closed 8 months ago

nathanielrindlaub commented 8 months ago

With larger images (e.g. >3 MB, like those that Browning cameras produce; see the attached example), requests to the MegaDetector endpoints succeed and return predictions, but requests to both the MIRAv2 and NZDOC endpoints consistently fail with the following error:

Error: ValidationError: Request {request_id} has oversize body.

All endpoints have their maximum MemorySizeInMB set to 6 GB.

My best guess at the moment is that the pre-processing step of cropping out the bounding box inflates memory usage significantly, which appears to be what this person experienced and described on StackOverflow: https://stackoverflow.com/questions/77537913/sagemaker-serverless-validationerror-oversize-body

For now, I'm going to try resizing images that are too large on the animl-api-inference Lambda side, before sending them to the SageMaker Serverless Endpoints for classification, and see if that does the trick.
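As a rough illustration of the "resize only if too large" check described above, here is a minimal Python sketch. It is not the code from animl-api: the `MAX_BYTES` threshold, the function names, and the assumption that file size scales roughly with pixel count are all hypothetical and chosen only for illustration. It computes target dimensions; the actual resampling would be done by whatever image library the Lambda uses.

```python
import math

# Hypothetical threshold: the report above only says that images over
# roughly 3 MB fail, so 3 MiB is an illustrative cut-off, not a documented limit.
MAX_BYTES = 3 * 1024 * 1024


def needs_resize(size_bytes: int, max_bytes: int = MAX_BYTES) -> bool:
    """Return True when the encoded image is larger than the threshold."""
    return size_bytes > max_bytes


def target_dimensions(width: int, height: int, size_bytes: int,
                      max_bytes: int = MAX_BYTES) -> tuple[int, int]:
    """Return (width, height) to resize to, preserving aspect ratio.

    Assumption: encoded size is roughly proportional to pixel count, so
    each side is scaled by sqrt(max_bytes / size_bytes). Real JPEG sizes
    vary with content, so a caller may need to re-check after resizing.
    """
    if not needs_resize(size_bytes, max_bytes):
        return width, height
    scale = math.sqrt(max_bytes / size_bytes)
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
```

Images under the threshold pass through unchanged, so only the oversized ones pay the cost of resizing before being sent to the classifier endpoints.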

[Attached image: browning-test-with-serial-number]

nathanielrindlaub commented 8 months ago

Implemented conditional resizing upstream in the animl-api-inference Lambda here: https://github.com/tnc-ca-geo/animl-api/commit/11f0230b0fd420f03e8cb7f58cdc47b3ec4376a8