loris-imageserver / loris

Loris IIIF Image Server

possibility for wrong file in Loris cache using customized HTTP resolver #524

Open VlastaAIP opened 3 years ago

VlastaAIP commented 3 years ago

Greetings,

we are running a customized Loris IIIF server on the latest version, 3.2.1, with a custom HTTP resolver that downloads images into the HTTP cache under our own prefix. The Loris cache directory names are also customized, and after a derivative is saved as JPG it is enhanced with PyExiv2 based on the downloaded image.

We use the image server for our application, which can receive a lot of image requests, and there is a chance that an image stored in the Loris cache comes from a completely different request. This was happening in the previous Loris 2 version, and I have noticed the same issue in the latest version. The images in the HTTP cache are fine; the problem appears in the Loris cache, where for example default.jpg or info.json sits in the correct position but comes from a different request. How is this possible? It does not happen often, only in about 2% of requests. We are using the latest modules from requests.txt. We also do not know whether Apache2 is causing it. We are using the default configuration with 10 processes and 15 threads. Could this be fixed by using a single process and a single thread for Apache2?

bcail commented 3 years ago

@VlastaAIP ok, so the downloaded images are fine, but the images that Loris generates (the derivative images) are getting mixed up in the cache?

When you get an incorrect image in the derivative cache, is it the correct image but with the wrong size/region/...? Or is it a completely different image altogether?

Are you able to give us an example of a couple URLs that get mixed up?

VlastaAIP commented 3 years ago

Yes, they are getting mixed up, and it is a completely different image. I will send you some examples when possible, because we have already fixed the latest ones.

VlastaAIP commented 3 years ago

We might have a clue pointing to Apache threading. We have Loris on a test server, and for example this happened. This page is OK: http://5.102.53.70/loris/CJSMSB-CJSMSB4ODDILLISACI3798ZAF-cs/ID0148/full/full/0/default.jpg

This page is from the wrong document: http://5.102.53.70/loris/CJSMSB-CJSMSB4ODDILLISACI3798ZAF-cs/ID0149/full/full/0/default.jpg It is from this one: http://5.102.53.70/loris/AIPDIG-AMBFSK49_88_______4F2PNN8-cs/ID0037V/full/full/0/default.jpg

When we turned off multithreading in Apache (thread=1), the problem seems to go away. We have Ubuntu 16.04 with Python 3 and its WSGI version. We do not know why our Loris is incompatible with multithreading.
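One common way a WSGI application ends up incompatible with multithreading is per-request state stored on an object that all threads in a process share. The sketch below only illustrates that general failure mode; it is not Loris code and not the customized resolver from this thread. The class names, the `/cache/...` path layout, and the `run` helper are all hypothetical, and a `threading.Barrier` is used to force the unlucky interleaving every time instead of in roughly 2% of requests.

```python
import threading


class SharedStateResolver:
    """Hypothetical resolver that keeps per-request data on the instance."""

    def __init__(self):
        # One instance serves every thread, so this attribute is shared.
        self.current_ident = None

    def resolve(self, ident, barrier):
        self.current_ident = ident
        # Wait until every thread has written before any thread reads.
        barrier.wait()
        return f"/cache/{self.current_ident}/default.jpg"


class LocalStateResolver:
    """Same logic, but the per-request data stays in a local variable."""

    def resolve(self, ident, barrier):
        current_ident = ident
        barrier.wait()
        return f"/cache/{current_ident}/default.jpg"


def run(resolver, idents):
    """Resolve each identifier in its own thread and collect the paths."""
    barrier = threading.Barrier(len(idents))
    results = {}

    def worker(ident):
        results[ident] = resolver.resolve(ident, barrier)

    threads = [threading.Thread(target=worker, args=(i,)) for i in idents]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    idents = ["ID0149", "ID0037V"]
    # Shared state: both requests end up with the same path,
    # so one of them is served the other's file.
    print(run(SharedStateResolver(), idents))
    # Local state: each request gets its own path.
    print(run(LocalStateResolver(), idents))
```

If something like this is the cause, running with a single thread per process would hide it, because only one request at a time touches the shared object. That would match the behaviour described above, but it is an assumption until the customized code is checked.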

bcail commented 3 years ago

It's good that turning multi-threading off may fix it. If that does fix the issue, you're welcome to open a PR adding a note to the docs.
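If threading does turn out to be the trigger, another thing that might be worth checking in customized cache or post-processing code is whether two requests can pass through the same intermediate file name. The following is a minimal sketch of the usual defensive pattern, assuming the cache entry is written by your own code: write to a uniquely named temporary file in the target directory, then move it into place with `os.replace`. The function name `write_cache_entry` and the paths are hypothetical and are not part of Loris.

```python
import os
import tempfile


def write_cache_entry(cache_path, data):
    """Write bytes to cache_path via a unique temp file and an atomic rename."""
    directory = os.path.dirname(cache_path)
    os.makedirs(directory, exist_ok=True)
    # mkstemp gives each caller its own file, so concurrent threads or
    # processes never write through the same intermediate name.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Same directory, so the rename stays on one filesystem.
        os.replace(tmp_path, cache_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return cache_path
```

With this pattern a reader sees either the old complete file or the new complete file at `cache_path`, never a file that another request is still writing.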