Closed Jeffreydaw closed 5 years ago
This problem won't go away.
Today I rolled out v2.3.2, and after doing so I dumped the nginx cache. Now I'm seeing the tiling problem below for GIOPS Daily for the 4th of April 2018.
Before dumping the cache I was also seeing the problem.
Maybe a Python cache problem?
As can be seen here, even when the day moved to the 6th of April the problem is still present (though something with the tiles has changed). It is important to note that this problem does not happen when run locally (tested in dev mode).
Note: The problem is present on the 4th and the 6th but does not seem to be present on the 5th.
The 6th of April 2018 corresponds to the time stamp 758, but the cache has:
... 754/ 755/ 756/ 757/ 758/
I was able to reproduce the problem on my computer by copying the Python cache from the server to my computer.
This was the result for the 4th of April 2018, day index 758. Clearly, it is the same as the 4th in the post above. When this was done the files in the cache were not overwritten, so it seems clear that these are the correct files. Also, I stitched together the images that were in folders in the cache directory (dragged and dropped into LibreOffice), and this was the result.
The last thing I tried was viewing the same tile as displayed by the navigator alongside the one in the file. They were the same.
I believe it is safe to say that, unlike issue #66, this is not an nginx caching problem. I think it can even be said that the source of the problem is not the Python cache (though it stores the problem after it comes up). This issue is looking more and more like a race condition, as has been suspected for some time. The problem now is to find out whether this is caused by Python doing multithreading or by the uWSGI Emperor.
It was also thought that there could be some issue with mkstemp. But it looks like the files that get saved are deleted or moved right away, so there should only be one mkstemp file in the folder at a time (this has only been tested in dev mode on my computer).
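For reference, this is roughly the pattern I would expect around mkstemp if the goal is that no reader ever sees a half-written tile. It is only a sketch, not the navigator's actual code: the function name `write_tile_atomically` and the cache path layout are made up for illustration. The idea is to write to a temp file in the same directory and then rename it over the final name.

```python
import os
import tempfile


def write_tile_atomically(cache_path: str, data: bytes) -> None:
    """Write data to cache_path so readers never see a partially written file.

    Hypothetical helper for illustration; not taken from the navigator code.
    """
    directory = os.path.dirname(cache_path) or "."
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the target directory so the rename below
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # os.replace swaps the file in with a single rename (atomic on POSIX).
        os.replace(tmp_path, cache_path)
    except BaseException:
        # If anything failed before the rename, do not leave the temp file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If two workers render the same tile at once, each writes its own temp file and the last rename wins, so the cached file is always one complete image. This does not help if the rendered image itself is wrong.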
This problem has now also been seen in the Arctic projections.
This has been moved to the icebox because the remaining debugging methods have been reduced to a few options that will take some work to set up and may not provide results.
Deleting the cache resolved the problem as usual, though it is still present on April 27th.
I have tried running the uWSGI server on my computer using Gunicorn. I loaded about 60 days in the navigator, trying to force a race condition. I tried: loading multiple pages at once; loading a page and interrupting it; loading the same day in many tabs at the same time; loading across different browsers at the same time (Opera and Chrome); using the arrows and the calendar to select the date; and using the API to load one tile and then loading the navigator.
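The manual attempts above could probably be scripted. A minimal sketch, assuming the tile can be fetched by some callable `fetch` (a placeholder here, not something from the navigator): call it many times in parallel and collect the distinct hashes of the responses. The name `hammer_tile` is also made up.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def hammer_tile(fetch: Callable[[], bytes], workers: int = 16, rounds: int = 64) -> set:
    """Call fetch() concurrently and return the set of distinct response hashes.

    `fetch` is a placeholder for whatever retrieves one tile, for example an
    HTTP GET against a tile URL on the local server.
    """
    def one_request(_):
        return hashlib.sha256(fetch()).hexdigest()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return set(pool.map(one_request, range(rounds)))
```

For a real run, `fetch` could be something like `lambda: urllib.request.urlopen(tile_url).read()` with `tile_url` pointing at one tile. More than one hash in the result would mean the same tile came back with different content, which is what a race would look like from the outside.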
I finally got the tiling problem on localhost. This happened the day after the local test with the uWSGI server running locally. The dates that have the problem are May 2nd and 3rd.
The tiling problem returned. It was noticed that there were requests in the log file for future time indexes. This is a problem, as it creates the bad files that are cached.
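One way to keep requests for future time indexes from producing cached tiles would be to check the index before rendering. This is a sketch under assumptions: that time indexes are zero-based positions on the dataset's time axis, and that the number of available time steps is known when the request comes in. `validate_time_index` and `available_count` are names invented for this example.

```python
def validate_time_index(requested: int, available_count: int) -> int:
    """Return the index if the dataset has it, otherwise raise ValueError.

    Assumes zero-based indexes; `available_count` is the number of time
    steps currently in the dataset (both are assumptions for this sketch).
    """
    if not 0 <= requested < available_count:
        raise ValueError(
            f"time index {requested} is outside 0..{available_count - 1}; "
            "refusing to render or cache a tile for it"
        )
    return requested
```

The tile handler would call this first and return an error response on `ValueError` instead of writing anything to the cache.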
I thought this problem was gone due to the change @NoahGallant-MUN made... maybe not :(
I think the return of this problem was linked to the problem with the misplaced file for issue #353. I deleted the cache and it is working again.
I am moving this issue to "Done" as I think Noah's fix works.
Closing because I think this is mostly solved; it may have to be reopened at a later time.
The tiling problem from issue #66 is back. It is not appearing in GIOPS forecast; instead it is in GIOPS daily (as bug #66 was thought to have been when first logged).
Bug #66 was an nginx caching problem; however, I have been able to verify that the problem in GIOPS daily is not an nginx caching problem. I checked the Python cache and the images there matched the tiles being displayed in the navigator; it is just that some of them were wrong. The files were also all created at the same time, sometimes in the same minute. I then deleted the tile in the nginx cache and it came back the same as it was (bad). I followed that up by deleting one of the tiles in the Python cache and the nginx cache at the same time. When doing this, the tile filled in correctly and was regenerated in the Python cache.
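To check the "created in the same minute" observation without eyeballing a directory listing, something like the following could be used. It is a rough sketch: it uses the modification time as a stand-in for creation time, and `group_tiles_by_minute` is a made-up name. The cache directory is whatever the Python cache path is on the machine.

```python
import os
from collections import defaultdict
from datetime import datetime


def group_tiles_by_minute(cache_dir: str) -> dict:
    """Map 'YYYY-MM-DD HH:MM' to the sorted files last modified in that minute."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            # Modification time (local time), truncated to the minute.
            stamp = datetime.fromtimestamp(os.path.getmtime(path))
            groups[stamp.strftime("%Y-%m-%d %H:%M")].append(path)
    return {minute: sorted(paths) for minute, paths in groups.items()}
```

Minutes with many tiles are the bursts where several requests were rendering at once, which is where I would look first for bad tiles.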
This leads me to believe that this is a race condition, either with Python doing multithreading or with the uWSGI Emperor. I'm not sure which, and more testing is needed to be sure.
This is present in version v2.1.3.