Open mzeinstra opened 9 years ago
There is no problem with the encoding itself. The problem comes from using one ID, Rijksmuseum_SK-C-5, for more than one image with different URLs. This ID is used in batches 2 and 4. The tiles of both images with the same ID remain in memcache on the IIIF server, and they get mixed there. The IIIF server has no concept of triggered cache flushing; I can only flush the cache manually. It could probably be developed or set up differently in #57.
The best approach is to always use one ID with one URL.
The same situation can happen if images are reordered in a sequence (#37).
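A quick way to catch this before ingest is to check the batches for IDs that appear with more than one URL. The following is only a sketch: the record layout (dicts with "id" and "url" keys), the function name and the example.org URLs are assumptions for illustration, not the actual ingest format.

```python
from collections import defaultdict


def find_conflicting_ids(records):
    """Return {id: sorted list of URLs} for IDs used with more than one URL.

    `records` is assumed to be an iterable of dicts with "id" and "url" keys.
    """
    urls_by_id = defaultdict(set)
    for record in records:
        urls_by_id[record["id"]].add(record["url"])
    # Keep only the IDs that point at two or more distinct URLs.
    return {rid: sorted(urls) for rid, urls in urls_by_id.items() if len(urls) > 1}


# Hypothetical records from two batches reusing the same ID.
batches = [
    {"id": "Rijksmuseum_SK-C-5", "url": "http://example.org/batch2/nightwatch.jpg"},
    {"id": "Rijksmuseum_SK-C-5", "url": "http://example.org/batch4/nightwatch.jpg"},
    {"id": "Some_Other_ID", "url": "http://example.org/batch2/other.jpg"},
]
conflicts = find_conflicting_ids(batches)
```

Any ID reported here would end up with mixed tiles in the cache until the caches expire or are flushed.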
Interesting, that is not how I read the Wiki:
'New records, and existing records that changed their "url", field will be added to a background queue that downloads source images and transcodes them to JPEG2000 format appropriate for serving via the IIIF Image service.'
I expected that when I give a different URI, that would replace the image, not mix the two images.
@klokan do you consider this behaviour by design or as a bug?
What is described in the Wiki happens. It may just take some time until the new image appears on the web to the user.
The background info:
There is a trade-off between the performance of the image service and the possibility of updating an image under one ID (URL). If you want a fast and scalable service, then you cannot change images for existing identifiers too often, because of caching.
Multiple caches in different places are used for better performance (and all of them need to be flushed when an individual image is updated, ideally at the same moment, which is almost impossible across multiple machines):
The caches would need to be flushed bottom up (first the local S3 file cache on all iiif machines, then in the processes, then in memcached, then wait until it happens in users' browsers, etc.). The iiif machines would need to receive, via a trigger (webhook), the list of files (IDs or URL prefixes) to flush. We have no infrastructure proposed for this in the diagrams in the wiki or in the specification of the project.
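To illustrate the bottom-up order on a single machine, here is a minimal sketch. Everything in it is an assumption: the function name, the flat cache directory layout, and the idea that the caller already knows the memcached keys (memcached itself cannot delete by prefix). The process-level cache and the browser caches are left out.

```python
import os


def flush_local_caches(prefix, cache_dir, memcache, cached_keys):
    """Flush one identifier on a single machine, lowest cache layer first.

    `memcache` is any client object with a delete(key) method;
    `cached_keys` is the list of tile keys known to the caller.
    Note: a plain prefix match would also hit longer IDs that start
    with the same characters, so a real version needs a stricter match.
    """
    removed_files = []
    # 1. Local file cache: remove every cached file for this identifier.
    for name in sorted(os.listdir(cache_dir)):
        if name.startswith(prefix):
            os.remove(os.path.join(cache_dir, name))
            removed_files.append(name)
    # 2. memcached: drop the tiles only after the files are gone, so a
    #    request arriving in between cannot refill it from a stale file.
    removed_keys = [key for key in cached_keys if key.startswith(prefix)]
    for key in removed_keys:
        memcache.delete(key)
    return removed_files, removed_keys
```

The order matters because each upper layer is refilled from the one below it: if memcached were cleared first, the next request could repopulate it from the old file in the local cache.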
We discussed this internally a month ago, and the result was not to put extra effort in this direction, especially after we had already implemented the whole sequences as an extra. BTW this is related to #37 as well.
The use case of updating an image for a single identifier is very rare, and if it happens, the caches are flushed automatically after a certain time. If the situation becomes extremely bothersome, all the caches can be flushed manually in all places.
To solve this properly, a new endpoint "/delete" would need to be added to the running "iiif" servers (a FastCGI inside supervisord). Such an endpoint would accept a file name (URL prefix) that should be flushed from all caches (local file cache and local memcached on the machine).
The task that performs the ingest update operation (a change of the URL for a known ID) would need to trigger the delete endpoint with the file that should be removed, on all machines started in the load balancer.
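The ingest side of this could look roughly like the sketch below. Nothing here exists yet: the "/delete" endpoint is only proposed above, and the "prefix" query parameter, the use of POST, the host names and both function names are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_flush_urls(servers, identifier):
    """Return one /delete URL per IIIF server for the given identifier."""
    query = urlencode({"prefix": identifier})
    return [f"{server.rstrip('/')}/delete?{query}" for server in servers]


def _post(url):
    """Default sender: POST to the URL and report whether it answered 200."""
    with urlopen(Request(url, method="POST"), timeout=10) as response:
        return response.status == 200


def flush_everywhere(servers, identifier, send=_post):
    """Call the (hypothetical) /delete endpoint on every server.

    Returns a dict mapping each URL to True (flushed) or False (failed),
    so the ingest task can retry the machines that did not respond.
    """
    results = {}
    for url in build_flush_urls(servers, identifier):
        try:
            results[url] = bool(send(url))
        except OSError:  # covers URLError, HTTPError and timeouts
            results[url] = False
    return results
```

The list of servers would have to come from wherever the load balancer keeps its pool, since every machine holds its own file cache and memcached. Returning a per-machine result keeps a partial failure visible instead of silently leaving one machine with stale tiles.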
Let's mark this as an enhancement.
@o1da
I expect it means deleting the file cache of s3fs and restarting memcached on every iiif virtual machine running behind the load-balancer.
It will hurt performance of the service temporarily, but it solves the problem, if it appears.
Cleaning of caches is done and described in the installation protocol.
Hi,
I've uploaded 2 versions of the Night Watch to compare image quality.
It seems that the encoding of the first one went horribly wrong :)