ContactEngineering / topobank

Create, visualize, analyze, share, and publish digital surface twins
https://contact.engineering
MIT License

Upload progress bar jumps weirdly in production #794

Closed pastewka closed 2 years ago

pastewka commented 2 years ago

The progress bar jumps back to zero at intermediate steps. This is identical to what we see for the analysis progress bars. It appears to happen only in production, not when developing locally.

pastewka commented 2 years ago

See #755

mcrot commented 2 years ago

Now I've seen the following: During upload, the browser requests the current progress state at short intervals. Normally, these HTTP requests return JSON data with the keys size and received, but sometimes the result is just null.

I think this null result marks the moment when the bar vanishes for a second. Looking into the code of uploadprogressbar, this means that the cache key, which is built from the progress_id, was not found in the cache. I think this could be a hint for the other cache problems.
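To illustrate how a cache miss turns into a null response, here is a minimal sketch. It is not the actual uploadprogressbar code: the dict standing in for the cache, the key format, and the function names are all assumptions made for illustration.

```python
import json

# Hypothetical stand-in for the cache backend; a plain dict here.
cache = {}


def progress_cache_key(progress_id):
    # Assumed key format, for illustration only.
    return f"upload-progress:{progress_id}"


def report_progress(progress_id):
    """Return the JSON body that a progress poll would receive."""
    # .get() yields None when the key is missing from the cache.
    data = cache.get(progress_cache_key(progress_id))
    # None serializes to the JSON literal null.
    return json.dumps(data)


cache[progress_cache_key("abc")] = {"size": 1000, "received": 250}
print(report_progress("abc"))  # {"size": 1000, "received": 250}
print(report_progress("xyz"))  # null
```

A client that treats null as "no upload in progress" would reset or hide the bar at exactly that poll.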

mcrot commented 2 years ago

It seems that this jumping was caused by using multiple cache instances. Sometimes the progress ID was found in the cache, sometimes not. The progress bar no longer jumps with a single cache server. Nevertheless, the upload of big files stops with an exception on the server side. But that is another problem, so I think this issue is solved. I'll keep it open until the compose file is changed accordingly.
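One way this could happen, sketched under the assumption that polls are served by different, unsynchronized cache instances: the progress entry is written to one instance only, so every poll that hits another instance misses. The dicts, the key name, and the round-robin routing below are illustrative, not the production setup.

```python
import itertools


def poll(instances, key, n_polls):
    """Read `key` n_polls times, rotating over independent cache instances."""
    rotation = itertools.cycle(instances)
    return [next(rotation).get(key) for _ in range(n_polls)]


progress = {"size": 1000, "received": 250}

# Two unsynchronized caches: the entry was written to the first one only.
cache_a, cache_b = {"progress-abc": progress}, {}
split = poll([cache_a, cache_b], "progress-abc", 4)

# One shared cache: every poll sees the entry.
single = poll([cache_a], "progress-abc", 4)

print(split)   # hits alternate with None (cache misses)
print(single)  # every poll is a hit
```

With a single instance there is no poll that can land on an empty cache, which matches the observation that the bar stops jumping.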

pastewka commented 2 years ago

Okay, great! The larger file issue seems to be related to size limitations of memcached. I think those should go away if we switch to redis, which does not have those limitations.
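For reference, a sketch of what a Redis cache setting could look like in a Django settings file. This assumes Django 4.0 or later, which ships a built-in Redis cache backend; the host name `redis` and port are hypothetical and would have to match the service in the compose file.

```python
# settings.py fragment (sketch): a single shared Redis cache.
CACHES = {
    "default": {
        # Built-in backend, available since Django 4.0.
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        # Hypothetical service name and default Redis port.
        "LOCATION": "redis://redis:6379",
    }
}
```

Pointing every worker at the same single LOCATION also avoids the split-cache situation described above.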

mcrot commented 2 years ago

It seems to also happen with smaller files. Somehow the memcached server is terminated from outside, probably by Docker Swarm, maybe due to instabilities in the internal network. If this is the case, Redis would be frequently terminated, too. But I'll give it a try, we will know more afterwards.

mcrot commented 2 years ago

This seems to be solved after simplifying the setup with Redis and, probably more relevant, fixing a wrong supervisord configuration.