boramalper / himawaripy

Set near-realtime picture of Earth as your desktop background
http://labs.boramalper.org/himawaripy
MIT License
1.62k stars 239 forks

there are eighteen processes named himawaripy #73

Closed longgangfan closed 7 years ago

longgangfan commented 8 years ago

Is it normal for eighteen processes named himawaripy to keep running?

longgangfan commented 8 years ago

I have resolved the problem by changing the crontab line "*/10 * * * * /usr/local/bin/himawaripy" to "*/10 * * * * /usr/local/bin/himawaripy >/dev/null 2>&1".

longgangfan commented 8 years ago

The solution doesn't seem to always work.

lachlandcp commented 8 years ago
longgangfan commented 8 years ago

Sometimes these processes stay in the background with nothing to do. The previous crontab cycle finished several minutes ago, yet even after the next cycle has begun, the himawaripy processes from the previous cycle are still there.

boramalper commented 8 years ago

@imnofox But how come there can be 18 processes?

We create a process pool like this:

p = Pool(cpu_count() * level)

so 18 minus one main process is equal to 17, which is a prime number (i.e. cannot be the product of cpu_count() * level).
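The arithmetic above can be checked with a small sketch. The helper names and the search bounds are made up for illustration and are not part of himawaripy; the only thing taken from the code is the pool size `cpu_count() * level`.

```python
def process_count(cpus, level):
    """One main process plus a pool of cpus * level workers."""
    return 1 + cpus * level


def configs_matching(total, max_cpus=64, max_level=20):
    """List (cpus, level) pairs whose process count equals `total`.

    The bounds are arbitrary illustration values, not himawaripy limits.
    """
    return [
        (cpus, level)
        for cpus in range(1, max_cpus + 1)
        for level in range(1, max_level + 1)
        if process_count(cpus, level) == total
    ]


if __name__ == "__main__":
    # 17 workers only factor as 1 * 17 or 17 * 1.
    print(configs_matching(18))  # [(1, 17), (17, 1)]
```

Apart from those two trivial factorisations, no single run adds up to 18 processes, so more than one run being alive at the same time looks like the more plausible explanation.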

I was using systemd, but will try with cron and see if I can find a fix.

lachlandcp commented 8 years ago

@boramalper Somehow (likely in a tired state) I totally misinterpreted the code. Please ignore my comments 😵.

boramalper commented 8 years ago

@imnofox No problem, I deeply appreciate your interest. :)


@longgangfan Can you please log the output of the cron job you set up for himawaripy and share it after about an hour? You can see how to do it here.

Sorry for the inconvenience, hope I'm not asking for too much.


I have two theories about the issue:

  1. Either it's because we don't close the process pool explicitly at the end of himawaripy,
  2. or because an exception raised in one or more downloader processes turns them into zombies.

The problem is that the Python documentation is not clear.

For the first case it says:

terminate()

Stops the worker processes immediately without completing outstanding work. When the pool object is garbage collected terminate() will be called immediately.

So actually we shouldn't have to call close() and join() at the end, but apparently we may need to.
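For reference, a minimal sketch of shutting the pool down explicitly instead of relying on garbage collection. `fetch_tile` and `run_pool` are hypothetical names, not himawaripy functions, and `ThreadPool` is used only because it shares the `Pool` interface (`map`, `close`, `join`, `terminate`) and keeps the sketch cheap to try out.

```python
from multiprocessing.pool import ThreadPool


def fetch_tile(index):
    """Hypothetical stand-in for downloading one image tile."""
    return index * index


def run_pool(pool_cls, worker_count, items):
    """Run fetch_tile over items and always shut the pool down."""
    pool = pool_cls(worker_count)
    try:
        results = pool.map(fetch_tile, items)
    except BaseException:
        pool.terminate()  # stop the workers right away on any failure
        pool.join()       # then wait until they have really exited
        raise
    pool.close()          # no more tasks will be submitted
    pool.join()           # wait for the workers to exit cleanly
    return results


if __name__ == "__main__":
    # Swap in multiprocessing.Pool here to use real worker processes.
    print(run_pool(ThreadPool, 4, range(5)))  # [0, 1, 4, 9, 16]
```

Whether this alone would stop the leftover processes is not established here; it only removes the dependence on when the pool object gets garbage collected.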

The second case seems to be more likely, so I'm kindly asking for logs. :)

longgangfan commented 8 years ago

I have attached a crontab log file. I think the problem is caused by failures in the downloader processes. The quality of the network connection varies over time, and the problem happens when the connection is bad. I have also noticed that each cycle creates 9 processes named himawaripy, and if the download fails, all 9 processes hang there with nothing to do. After a long time, many himawaripy processes (9, 18, 27, ...) accumulate in the background. They seem to use no CPU, but each one does take up some MB of memory. himawaripy.log.txt

boramalper commented 8 years ago

This commit should solve the problem. Can you confirm @longgangfan?

longgangfan commented 8 years ago

I am so sorry to tell you that the problem is still there.

boramalper commented 7 years ago

Hello,

v2 of himawaripy uses threads instead of multiple processes so hopefully this time it won't occur again (fingers crossed).
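To illustrate the idea (this is not the actual v2 code; `fetch_tile` and `download_all` are hypothetical names), a thread-based downloader could look roughly like this:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_tile(index):
    """Hypothetical stand-in for downloading one image tile."""
    return f"tile-{index}"


def download_all(indices, workers=4):
    # The worker threads live inside the single himawaripy process, so
    # no child processes are created. Leaving the `with` block waits
    # for the worker threads to finish.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fetch_tile, indices))


if __name__ == "__main__":
    print(download_all(range(3)))
```

A download that never returns would still keep that one process alive, so a timeout on the network calls remains worthwhile either way.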

Can you please make a clean installation of v2 (https://github.com/boramalper/himawaripy/tree/v2) and let me know if it fixes your problem?

Thanks!

longgangfan commented 7 years ago

Thank you for the improvement. I have made a clean installation and have been testing it for 110 minutes via crontab. Everything has worked fine so far. I stopped using himawaripy for the past two months because of this problem, but I think I will use it again. Thank you!

boramalper commented 7 years ago

Sorry for the inconvenience then, glad that it solved your problem! :)