kartoza / geosafe

InaSAFE package for Geonode
GNU General Public License v3.0

Many analyses not completing #520

Open gubuntu opened 5 years ago

gubuntu commented 5 years ago

problem

Many analyses are not completing

proposed solution

troubleshoot and fix...

[Screenshot 2019-01-07 at 22.48.56.png]

lucernae commented 5 years ago

Hi @gubuntu, I downloaded the layer and tested it in my local environment. I can replicate this consistently when I use huge bounds, so I think the cause is that the analysis extent is too large. I suggest we shift the focus to telling the user clearly why an analysis is failing. At the moment we record the stack trace, but it only says that the analysis crashed (probably because of huge memory usage), so it isn't really informative for the user. We also don't have a way to flag an analysis as failed, but a periodic task check might solve that.

So, in summary, I would have to implement a more robust task tracker (the issue is in the backlog as #383). It won't fix the problem itself, but it will provide more information on why a task is failing, so that we can then check the individual cause for each task. This is going to be a huge feature to implement, though.
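As a rough illustration of the periodic task check mentioned above, here is a minimal sketch in plain Python. It is not GeoSAFE code: `AnalysisRecord`, `flag_stale_analyses`, and the 6-hour threshold are all hypothetical names and values chosen for the example, and a real version would presumably read from the analysis model and run on a schedule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class AnalysisRecord:
    """Hypothetical stand-in for a stored analysis; not GeoSAFE's actual model."""
    analysis_id: int
    state: str  # "PENDING", "SUCCESS" or "FAILURE"
    started_at: datetime
    failure_reason: Optional[str] = None


def flag_stale_analyses(
    records: List[AnalysisRecord],
    now: datetime,
    max_age: timedelta = timedelta(hours=6),  # assumed threshold, tune as needed
) -> List[AnalysisRecord]:
    """Mark analyses that have been pending longer than max_age as failed.

    Returns the records that were flagged during this check.
    """
    flagged = []
    for record in records:
        if record.state == "PENDING" and now - record.started_at > max_age:
            record.state = "FAILURE"
            record.failure_reason = (
                "No result after %s; the worker may have crashed or been restarted."
                % max_age
            )
            flagged.append(record)
    return flagged
```

Run periodically, a check like this would at least turn a silent "pending forever" into an explicit failure with a reason the user can read, even before a full task tracker exists.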

gubuntu commented 5 years ago

@NyakudyaA can you please emulate a few of the stuck processes on the desktop and see how long they take or if there are any other issues?

NyakudyaA commented 5 years ago

> @NyakudyaA can you please emulate a few of the stuck processes on the desktop and see how long they take or if there are any other issues?

@gubuntu Will attend to this and let them run. But my machine is a little beast. It might outperform the server

lucernae commented 5 years ago

Guys, @timlinux @NyakudyaA, this issue is related to https://github.com/kartoza/geosafe/issues/518 and https://github.com/kartoza/geosafe/issues/383

But honestly, it is a very big block of features if we want to implement full diagnostics in GeoSAFE. I was hoping that we could at least show some relevant information to the user, so that they get some feedback and know what is going on. Right now GeoSAFE hides the Celery complexity, so it only shows that a task is pending and doesn't report progress (because we don't have a way to get it in the first place). What I propose is to have an Analysis Detail page again, but to fill it with useful information on how to replicate the analysis on Desktop, so that a user like @NyakudyaA can replicate it.

We currently have an Analysis Detail page, but it is only there for developers like me to debug analysis errors. It might be good to add several details like:

I am suggesting this because I have already tried rerunning the analysis again and again, and it really takes a long time to complete. I also set up a cron schedule that restarts InaSAFE Headless daily, to make sure the Celery worker is always fresh (no memory leak). This means that what really happens is that those pending analyses take more than a day to complete and get restarted every day, so they never complete.
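To make the restart problem concrete, here is a small sketch of the arithmetic. It assumes that a restart throws away all progress and that the analysis is picked up again from scratch right after the restart. The function name and parameters are made up for illustration.

```python
from typing import Optional


def restarts_before_completion(
    runtime_hours: float,
    submitted_offset_hours: float,
    restart_interval_hours: float = 24.0,
) -> Optional[int]:
    """Return how many worker restarts interrupt an analysis before it finishes.

    runtime_hours: uninterrupted time the analysis needs.
    submitted_offset_hours: hours elapsed since the last restart at submission.
    Returns None if the analysis can never finish between two restarts.
    """
    # Time left in the current window before the next scheduled restart.
    first_window = restart_interval_hours - submitted_offset_hours
    if runtime_hours <= first_window:
        return 0  # finishes before the next restart
    if runtime_hours <= restart_interval_hours:
        return 1  # killed once, then fits into a full window
    return None  # needs longer than a full window, so it is killed every time
```

With a daily restart, anything that needs more than 24 hours falls into the last case, which matches what we see: the task stays pending indefinitely.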