DHI / terracotta

A light-weight, versatile XYZ tile server, built with Flask and Rasterio :earth_africa:
https://terracotta-python.readthedocs.org
MIT License
691 stars, 74 forks

For some data I get differently colored image tiles instead of true color #339

Open anup39 opened 5 months ago

anup39 commented 5 months ago

Screenshot 2024-06-06 at 17 14 35

dionhaefner commented 5 months ago

Do you see any errors in the server log?

anup39 commented 5 months ago

In the Flask application there are no errors, and the endpoint serves the images properly. However, I am using Gunicorn and nginx to serve the application. Whenever I increase the worker count, I am not getting this issue.

Screenshot 2024-06-07 at 08 46 42 Screenshot 2024-06-07 at 08 49 14

dionhaefner commented 5 months ago

What's your setup? Assuming the server is running on a single VM? Where are the images (local disk, S3, ...)?

anup39 commented 5 months ago

I am running on an Azure VM with 4 CPU cores, 16 GB RAM, and a 128 GB SSD for storage. Yes, the TIFF bands are on local storage.

j08lue commented 5 months ago

Looks like one of the bands is failing or gets an unexpected offset.

> Whenever I increase the worker count, I am not getting this issue

That is strange. Is that behaviour consistent? And at which worker size?

And does it change when you disable multiprocessing with USE_MULTIPROCESSING=false?

Thinking about this part here - reading each band in its own process...

https://github.com/DHI/terracotta/blob/8632e1c598a22de2d10c6cdb885cd4f1b6fe9d8e/terracotta/drivers/geotiff_raster_store.py#L29-L48

If one of the bands fails, the whole tile production should fail, though. And Gunicorn workers should not affect Python processes, should they? 🤔

Sorry for all the guesswork. Are you sure your file is ok, @anup39? For example, try opening it in QGIS or so and zoom in and out.
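
To illustrate the point above that a failed band read should fail the whole tile, here is a minimal sketch. It is not Terracotta's actual code. It assumes band reads are submitted to a `concurrent.futures` executor; `read_bands` and `fake_read_band` are hypothetical names, and a thread pool stands in for a process pool to keep the example self-contained.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, List, Sequence


def read_bands(
    paths: Sequence[str],
    read_band: Callable[[str], bytes],
    executor: Executor,
) -> List[bytes]:
    """Read every band through the executor; fail if any single read fails."""
    futures = [executor.submit(read_band, path) for path in paths]
    # Future.result() re-raises whatever the worker raised, so one bad band
    # aborts the whole call instead of silently returning a partial result.
    return [future.result() for future in futures]


def fake_read_band(path: str) -> bytes:
    """Hypothetical stand-in for a raster read (assumption, not a real API)."""
    if path.endswith("broken.tif"):
        raise OSError(f"could not read {path}")
    return path.encode()


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=3) as pool:
        print(read_bands(["r.tif", "g.tif", "b.tif"], fake_read_band, pool))
        try:
            read_bands(["r.tif", "broken.tif", "b.tif"], fake_read_band, pool)
        except OSError as exc:
            print("tile failed:", exc)
```

If the real code path behaves like this, a wrongly colored tile cannot come from a band read that raised. It would have to come from a read that succeeded but returned wrong data.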

dionhaefner commented 5 months ago

> If one of the bands fails, the whole tile production should fail, though.

Correct, so this could only happen if the data being read looks OK but is faulty (corrupt overview, data being truncated while read), or through a race condition or cache pollution somewhere. Which would be almost impossible to debug without a reproducer...
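
One possible starting point for a reproducer is to request the same tiles repeatedly and in parallel, then flag any URL whose response bytes differ between requests. The sketch below uses only the standard library. `find_unstable_tiles` and `fetch_tile` are hypothetical helper names, and it assumes the server returns byte-identical images for identical requests when everything works.

```python
import hashlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Set
from urllib.request import urlopen


def fetch_tile(url: str) -> bytes:
    """Download one tile and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def find_unstable_tiles(
    urls: Iterable[str],
    repeats: int = 20,
    workers: int = 8,
    fetch: Callable[[str], bytes] = fetch_tile,
) -> Dict[str, Set[str]]:
    """Return the URLs that produced more than one distinct response body."""
    # Each URL is requested `repeats` times; the requests run concurrently.
    jobs = [url for url in urls for _ in range(repeats)]
    digests: Dict[str, Set[str]] = defaultdict(set)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so zip lines up.
        for url, body in zip(jobs, pool.map(fetch, jobs)):
            digests[url].add(hashlib.sha256(body).hexdigest())
    return {url: seen for url, seen in digests.items() if len(seen) > 1}
```

Calling `find_unstable_tiles` with a handful of tile URLs from the affected deployment, once per Gunicorn worker count, would show whether the wrong colors are intermittent for a given tile or fully deterministic.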