nleroy917 / spottydata-api

Backend and Web API for spottydata.com
MIT License
7 stars, 0 forks

Parallelize the Audio Analysis Function #18

Closed: nleroy917 closed this issue 4 years ago

nleroy917 commented 4 years ago

Audio analysis on Spotify's end is pretty slow. If I can split the job into 2 or 4 separate processes, the calculation time should drop roughly in proportion to the number of processes. I don't think I need to worry about rate limits for this endpoint, and I am already pinging the server as fast as I can anyway.

This would greatly improve UX, since the current rate-limiting step is how fast I can generate an audio analysis for a song.

TODO - Find a way to gracefully parallelize the audio analysis function within lib/track_analysis

(Will require threading I think)
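
A minimal sketch of the threading idea, not the actual code in lib/track_analysis. The per-track work is mostly waiting on the network, so a thread pool from the standard library should be enough. `fetch_audio_analysis` and `analyze_tracks` are hypothetical names, and the `time.sleep` call stands in for the real Spotify request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_audio_analysis(track_id):
    """Hypothetical stand-in for the per-track Spotify request."""
    time.sleep(0.05)  # simulate network latency instead of calling the API
    return {"id": track_id, "analysis": "placeholder"}


def analyze_tracks(track_ids, max_workers=4):
    """Fetch analyses concurrently; results keep the order of track_ids."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, even if they finish out of order
        return list(pool.map(fetch_audio_analysis, track_ids))


if __name__ == "__main__":
    results = analyze_tracks([f"track{i}" for i in range(8)])
    print(len(results))
```

With 4 workers and 8 tracks, the simulated requests run in about two rounds instead of eight, so the speedup is bounded by the worker count rather than growing without limit.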

nleroy917 commented 4 years ago

Check out this page: https://stackoverflow.com/questions/48082264/implement-parallel-for-loops-in-python

nleroy917 commented 4 years ago

In addition, I found this about gunicorn and its ability to handle simultaneous requests:

How Many Workers? DO NOT scale the number of workers to the number of clients you expect to have. Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.

Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.

Obviously, your particular hardware and application are going to affect the optimal number of workers. Our recommendation is to start with the above guess and tune using TTIN and TTOU signals while the application is under load.

Always remember, there is such a thing as too many workers. After a point your worker processes will start thrashing system resources decreasing the throughput of the entire system.
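
As a quick illustration of the (2 x $num_cores) + 1 starting point quoted above, here is a small sketch. `suggested_workers` is a hypothetical helper, not part of gunicorn, and the result is only a first guess to tune under load.

```python
import os


def suggested_workers(num_cores=None):
    """Return (2 x num_cores) + 1, the starting guess from the gunicorn docs."""
    if num_cores is None:
        # os.cpu_count() can return None, so fall back to a single core
        num_cores = os.cpu_count() or 1
    return 2 * num_cores + 1


if __name__ == "__main__":
    print(suggested_workers())
```

The value could then be passed to gunicorn through its `--workers` option.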

nleroy917 commented 4 years ago

I am dumb... I just found this route, GET https://api.spotify.com/v1/audio-features, which returns features for several tracks at once instead of one at a time. I hope this will improve performance drastically. Will update when I rewrite the analysis-gathering route.
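
A rough sketch of how the track IDs could be grouped for that route. It only builds the request URLs and does not call Spotify. `batched_urls` is a hypothetical helper, and the cap of 100 IDs per request is an assumption that should be checked against the current Spotify documentation.

```python
AUDIO_FEATURES_URL = "https://api.spotify.com/v1/audio-features"
MAX_IDS_PER_REQUEST = 100  # assumption: verify against the Spotify docs


def batched_urls(track_ids, batch_size=MAX_IDS_PER_REQUEST):
    """Build one URL per batch, with the IDs comma-separated in `ids`."""
    urls = []
    for start in range(0, len(track_ids), batch_size):
        batch = track_ids[start:start + batch_size]
        urls.append(f"{AUDIO_FEATURES_URL}?ids={','.join(batch)}")
    return urls


if __name__ == "__main__":
    ids = [f"id{i}" for i in range(250)]
    print(len(batched_urls(ids)))
```

For 250 tracks this means 3 requests instead of 250, which is where most of the time savings would come from.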

nleroy917 commented 4 years ago

I was able to substantially decrease the analysis time by hitting Spotify's multiple-analysis and multiple-artist routes. This also helps me avoid rate limits.