AtlasOfLivingAustralia / image-service

Image repository and tiling services
https://images.ala.org.au

Add content-type: check to externally retrieved images #200

Open joe-lipson opened 2 months ago

joe-lipson commented 2 months ago

Rose has reported that there are a significant number of images in a large (200k records) upcoming Australian Museum images ingest where the image URL they provide is invalid. We can let AM know about these records, and it's up to them to fix them, but what seems to be happening is that when the image service goes to fetch the image, the AM servers come back with a 200 OK but the page is text. The image service then seems to save this text as a file and present it as an image, which shows as broken in the UI.

An example of a good image coming from AM is http://203.22.224.10/collection/imu/request.php?request=Multimedia&method=fetch&key=1112947&filter=width:bf:400

A bad one (this still returns a 200 OK) is http://203.22.224.10/collection/imu/request.php?request=Multimedia&method=fetch&key=1112347&filter=width:bf:400

And this is how a bad image looks after ingest https://images-test.ala.org.au/image/384ed76f-6961-450b-97b6-e23250807c08

https://images-test.ala.org.au/?q=&fq=dataResourceUid%3Adr340&offset=150&max=50&sort=dateUploaded&order=desc

Would it be possible for the image-service to check the Content-Type header for external images and reject bad ones? This seems like it would be a good thing to have for all image ingests.
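
To make the request concrete, here is a rough sketch of the kind of check I mean. It is written in Python purely for illustration and is not taken from the image-service code; the function names `is_image_content_type` and `fetch_image` are hypothetical. It also assumes the AM servers label the bad responses with a non-image type such as `text/html`, which I have not confirmed.

```python
from typing import Optional
import urllib.request


def is_image_content_type(header_value: Optional[str]) -> bool:
    """Return True if a Content-Type header value names an image media type."""
    if not header_value:
        # A missing or empty header gives us no evidence that this is an image.
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type.startswith("image/")


def fetch_image(url: str, timeout: float = 30.0) -> bytes:
    """Download url, raising ValueError if the response is not labelled as an image."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type")
        if not is_image_content_type(content_type):
            # Hypothetical policy: reject instead of storing the body as an image.
            raise ValueError(f"Rejected {url}: Content-Type is {content_type!r}")
        return response.read()
```

With something like this, a 200 OK response whose header is `text/html; charset=UTF-8` would be rejected at fetch time instead of being stored. If a server mislabels its text responses as images, the header check alone would not catch it and the file contents would need inspecting as well.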