Armory8854 closed this issue 1 year ago
DOWNLOADS should use threading, but CONVERSIONS should probably use either multiprocessing or async. Which one is best still needs to be determined.
To elaborate, I have tested threading on the opusconversion step, and quickly deduced that I either need to A) limit max threads, or B) run downloads concurrently but run conversions in a more linear fashion.
The CPU load is very heavy when converting files, especially multiple files at once. I may even remove conversion from the program, depending on how I end up settling this.
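A rough sketch of option B, assuming the work can be split into two stages: a wide thread pool for the I/O-bound downloads, then a small process pool for the CPU-bound conversions. The names `fetch_episode`, `convert_to_opus`, and `run_stage` are placeholders invented for illustration, not functions from this project, and the worker counts are guesses to tune.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fetch_episode(url):
    """Hypothetical stand-in for the real download (network I/O)."""
    return url.rsplit("/", 1)[-1]  # pretend the file was saved under this name


def convert_to_opus(path):
    """Hypothetical stand-in for the real CPU-heavy conversion."""
    return path.rsplit(".", 1)[0] + ".opus"


def run_stage(executor_cls, func, items, max_workers):
    """Run func over items in a bounded pool; results keep the input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


def download_then_convert(urls, download_threads=8, convert_workers=None):
    # Assumption: leaving about half the cores free keeps the machine usable.
    if convert_workers is None:
        convert_workers = max(1, (os.cpu_count() or 2) // 2)
    # Threads suit downloads: they spend most of their time waiting on the network.
    files = run_stage(ThreadPoolExecutor, fetch_episode, urls, download_threads)
    # Processes suit conversions: a small cap keeps the CPU load bounded.
    return run_stage(ProcessPoolExecutor, convert_to_opus, files, convert_workers)
```

When calling `download_then_convert` from a script, put the call under an `if __name__ == "__main__":` guard, since process pools may re-import the main module on some platforms. Setting `convert_workers=1` gives the fully linear conversion behaviour described above.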
While researching ways to speed up the download process, I came to the realization that I could be running multiple instances of newPodcastDownload() at the same time. This would involve a few new additions and concepts:
The hardest one for me to conceptualize here is number 3. My first thought is to query the database and store all of the key / values in a list of dictionaries, and then pass that to the download function. This way, we don't have to call the database every single time we want to run a download.
This does introduce another problem for me - how do I pass which podcasts were or were not downloaded to the final database call? My first idea now, before doing research, is to update the dictionary with a downloaded 0/1 key/value pair. I would just update the dictionary as I download.
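A minimal sketch of that flow, assuming SQLite and a hypothetical `episodes` table with `id`, `url`, and `downloaded` columns (the real schema may differ). The database is read once into a list of dictionaries, each dictionary is flagged as the download finishes, and a single final call writes the flags back.

```python
import sqlite3


def load_pending(conn):
    """One query up front: every undownloaded episode as a dict."""
    rows = conn.execute("SELECT id, url FROM episodes WHERE downloaded = 0")
    return [{"id": row[0], "url": row[1], "downloaded": 0} for row in rows]


def mark_downloads(episodes, download):
    """Call download(url) for each episode and flag the ones that succeed."""
    for ep in episodes:
        try:
            download(ep["url"])
            ep["downloaded"] = 1
        except Exception:
            pass  # flag stays 0, so this episode is not written back as done
    return episodes


def save_results(conn, episodes):
    """One final database call: persist only the successful downloads."""
    done = [(ep["id"],) for ep in episodes if ep["downloaded"] == 1]
    with conn:  # commits on success, rolls back on error
        conn.executemany("UPDATE episodes SET downloaded = 1 WHERE id = ?", done)
```

The `download` argument is whatever callable performs the real download; it only needs to raise an exception on failure for the flag to stay at 0.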
Final question: How do I ensure the downloads don't pull the same values? As in, if the process starts running 3 times, will it try to download podcast 1 three times at once? If it sets the status to downloaded 0/1 as it's pulled, it should remedy this. Possibly even add a new database field called `attempts` that counts up every time a podcast tries to download. That would give us some way to decide that a failed download should be skipped for this run and saved for a second run.
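One way to sketch this, assuming the workers are threads in a single process: put every pending episode on a shared `queue.Queue`, so each item is handed to exactly one worker and nothing is downloaded twice. The `attempts` key mirrors the proposed database field; `run_workers` and `max_attempts` are names made up for this example.

```python
import queue
import threading


def run_workers(episodes, download, workers=3, max_attempts=3):
    """Download each pending episode once, using a shared work queue."""
    tasks = queue.Queue()
    for ep in episodes:
        # Skip finished episodes and ones that have already failed too often.
        if ep["downloaded"] == 0 and ep["attempts"] < max_attempts:
            tasks.put(ep)

    def worker():
        while True:
            try:
                ep = tasks.get_nowait()  # each item goes to exactly one worker
            except queue.Empty:
                return
            ep["attempts"] += 1
            try:
                download(ep["url"])
                ep["downloaded"] = 1
            except Exception:
                pass  # not re-queued: left for the next run to retry

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return episodes
```

Because a failed episode is never put back on the queue, it is tried at most once per run, and the stored `attempts` count decides whether a later run picks it up again. If the workers were separate processes instead, the claim would have to happen in the database itself.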