brokkr / poca

A fast, multithreaded and highly customizable command line podcast client, written in Python 3
GNU General Public License v3.0

Multiprocessing #45

Closed brokkr closed 7 years ago

brokkr commented 7 years ago

Set up a socket for receiving feed updates and fire off one process for each subscription. The feed processes report back to the socket. On that socket runs a single, serial downloader that processes the updates (little Wanted+Unwanted+Lacking etc. packages). The processing includes deletes, downloads, and reports to user. The downloader/main process simply deals with the updates in the order they appear on the socket, i.e. more responsive servers will get first in line.

The proposed distinction between multiple update processes and main process is identifiable in the current code as that between 'plans' and 'execution'.

Since the downloading will still be serial, multiprocessing won't accomplish much in terms of speed gains, but it should minimize 'lag' and waiting. We stay away from parallel downloads partly because each download would steal bandwidth from the others, and partly because most update runs won't involve multiple downloads anyway, assuming the average user subscribes to, say, 10-20 podcasts and updates once an hour. Finally, and most importantly, total multiprocessing invites far more chaos when things go wrong and would require a greater UI rethink.
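A minimal sketch of the flow described above, with a multiprocessing.Queue standing in for the socket (update_worker, run_updates, and the plan fields are illustrative names, not poca code):

```python
from multiprocessing import Process, Queue


def update_worker(sub, reports):
    """One process per subscription: build its update plan and report back."""
    # In poca this is where the feed would be parsed and the
    # Wanted/Unwanted/Lacking package assembled; a dict stands in for it.
    plan = {"sub": sub, "wanted": [], "unwanted": [], "lacking": []}
    reports.put(plan)


def run_updates(subscriptions):
    reports = Queue()
    procs = [Process(target=update_worker, args=(s, reports))
             for s in subscriptions]
    for p in procs:
        p.start()
    # Single serial downloader: handle plans in arrival order, so the
    # most responsive servers get first in line.
    plans = []
    for _ in subscriptions:
        plans.append(reports.get())   # blocks until the next plan arrives
    for p in procs:
        p.join()
    return plans
```

Note this assumes a fork-based start method; on spawn platforms the usual `if __name__ == "__main__":` guard would be needed around the call site.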

brokkr commented 7 years ago

Since each worker would only be running a feedparser instance and doing nothing too CPU-intensive, multithreading should be fully sufficient. And instead of implementing our own network sockets and protocols, maybe we just use Queue, yes?

Outcome returns obviously need to be modified to accommodate the new structure.
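With threads and the standard-library queue.Queue replacing the socket, the same round could look like this (a sketch; update_subscription and the outcome fields are hypothetical names, not from the codebase):

```python
import queue
import threading


def update_subscription(sub, outbox):
    """Worker thread: parse one feed (feedparser would run here) and report back."""
    outcome = {"sub": sub, "lacking": [], "unwanted": []}  # placeholder outcome
    outbox.put(outcome)


def run_round(subscriptions):
    outbox = queue.Queue()
    threads = [threading.Thread(target=update_subscription, args=(s, outbox))
               for s in subscriptions]
    for t in threads:
        t.start()
    # The main thread is the single serial downloader: it handles outcomes
    # in arrival order, so responsive servers are processed first.
    outcomes = [outbox.get() for _ in subscriptions]
    for t in threads:
        t.join()
    return outcomes
```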

brokkr commented 7 years ago

Q: When we're running down the queue, how do we know when we're done?

In other words: if the queue is empty, is it because a thread hasn't returned yet, or because all threads have ended and we're through?

We know that all threads have ended when the

for t in threads:
    t.join()

loop is over. But we don't want to wait for that before we start processing results. What we need is to check, each time we finish one subscription, whether there are still 'living threads'.

EDIT: No, we don't.

We don't even need to lock anything down, as these all run in the main thread:

    q.unfinished_tasks + finished_tasks == len(threads)

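
A self-contained illustration of that bookkeeping (worker, the sleep, and the feed names are stand-ins, not poca code). The loop processes outcomes until finished_tasks reaches len(threads); the equality above then holds with q.unfinished_tasks at zero:

```python
import queue
import threading
import time


def worker(sub, q):
    """Stand-in for a feedparser update; reports its outcome to the queue."""
    time.sleep(0.01)
    q.put({"sub": sub})


subscriptions = ["feed-a", "feed-b", "feed-c"]
q = queue.Queue()
threads = [threading.Thread(target=worker, args=(s, q)) for s in subscriptions]
for t in threads:
    t.start()

finished_tasks = 0
processed = []
# All bookkeeping happens in the main thread, so no locking is needed.
while finished_tasks < len(threads):
    outcome = q.get()          # blocks until the next thread reports back
    processed.append(outcome)  # serial processing: deletes, downloads, reports
    q.task_done()              # decrements q.unfinished_tasks
    finished_tasks += 1

for t in threads:
    t.join()
```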
brokkr commented 7 years ago

Output: