jurialmunkey / plugin.video.themoviedb.helper

GNU General Public License v3.0

[Request] Increasing content loading speed #402

Closed matke-84 closed 3 years ago

matke-84 commented 3 years ago

My only real complaint about this fantastic addon is how slowly it loads content. Compared with the latest version of Elementum, TMDb Helper is several times slower: Elementum loads the same content almost immediately. The difference is most noticeable when fanart loading is enabled, and it is especially pronounced on Android devices, though it shows on PC too. It would be really nice if the loading speed were improved, and Elementum shows that this is possible, since it loads the same content as TMDb Helper. Happy Holidays :-)

jurialmunkey commented 3 years ago

Turn off Fanart TV requests for plugin/widgets. I've done a lot of benchmarking, and by far the biggest time cost is the request time for online services rather than any processing time in the plugin itself. With Fanart TV enabled, you are adding an additional online request per item in the list. So if Fanart TV takes 250ms per request (which is generous - at times it can be far worse), then for a 20-item list we are adding 5 seconds of additional time per list just waiting for their server to respond.
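
As a rough illustration of that arithmetic, here is a minimal sketch. The function name `added_wait_seconds` is hypothetical, and it assumes one blocking request per item, made one after another, on a 20-item list (the page size mentioned elsewhere in this thread).

```python
def added_wait_seconds(items: int, per_request_ms: float) -> float:
    """Extra wait when one blocking request is made per list item, in sequence."""
    return items * per_request_ms / 1000.0


# 20 items at an assumed 250 ms per Fanart TV request.
print(added_wait_seconds(20, 250))
```

The same function can be fed any measured per-request time to estimate the cost for a full page of items.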

Turn off Fanart TV and load a TMDb list like Popular Movies and it should load almost immediately (it takes less than a second for me).

Also, are you sure you are comparing like-for-like features? I just installed Elementum and all its lists look to be 10 items whilst TMDbHelper shows 20 items, so we are looking at double the loading time for every list simply by virtue of the number of items (and thus requests required).

Things that increase load time significantly:

Things that add a bit of load time but not much:

jurialmunkey commented 3 years ago

Also keep in mind that Elementum ships compiled binary code, which in terms of performance is impossible to compete with when using an interpreted language such as Python (which TMDbHelper uses). The issue with binary code is that it is opaque, so the user must trust that the person who compiled the code did not inject anything malicious into it, whereas with an interpreted language the code is transparent and can be read with a simple text editor.

To be fair to Elementum, the uncompiled code is open source and available on GitHub, but the user is still required to trust that the shipped binary was built from that code.

The biggest issue I've encountered with Python is that the requests module is not very efficient, both in terms of initial loading and in terms of doing the actual requests.

jurialmunkey commented 3 years ago

To give some context, here's a benchmark that I ran on a list from TMDb where I had a mix of movies with cached Fanart TV requests and others I hadn't looked up before:

Cached Fastest: 0.001 sec - FanartTV.get_artwork_request./65203/movies
Cached Slowest: 0.021 sec - FanartTV.get_artwork_request./297762/movies
Cached Average: 0.003 sec

Online Fastest: 0.311 sec - FanartTV.get_artwork_request./149/movies
Online Slowest: 0.807 sec - FanartTV.get_artwork_request./185/movies
Online Average: 0.551 sec

All this function does is check the cache for a previous response and return that, or otherwise do the online lookup. There's no other processing here - it is just the cache check and online request.
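
The pattern described above can be sketched roughly as follows. This is not the addon's actual code: `cached_lookup`, `_cache` and the `fetch` callable are hypothetical names, and an in-memory dict stands in for whatever cache the addon really uses.

```python
from typing import Callable, Dict

# Hypothetical in-memory cache keyed by request path.
_cache: Dict[str, dict] = {}


def cached_lookup(path: str, fetch: Callable[[str], dict]) -> dict:
    """Return the cached response for `path`, or fetch it once and cache it."""
    if path in _cache:
        return _cache[path]      # fast path: no network involved
    response = fetch(path)       # slow path: the online request
    _cache[path] = response
    return response
```

With this shape, only the first lookup for a given item pays the server's response time. Repeat lookups return straight from the cache, which matches the gap between the cached and online timings in the benchmark.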

As you can see, online requests to Fanart TV take on average 550ms each and can take up to 800ms, whereas retrieval from the cache takes ~3ms. I'm on a reliable 50Mbps HFC cable connection, using an i7 machine with an SSD, so there are no local performance bottlenecks. All I'm timing here is the retrieval from the API: it just takes the server that long to serve the request, and I can't make it go faster. So realistically, at ~550ms for each of 20 items, we are looking at more than 10 seconds just to retrieve Fanart TV requests for a single page of items, and that's before any processing or other look-ups.

Also, using cURL from the command line (which is just about as fast as we're going to get), I still wasn't able to get it down below 250ms -- and with the 800ms request above (which is A Clockwork Orange) I was only able to get down to 550ms, so the bottleneck is the Fanart TV server and nothing else.

jurialmunkey commented 3 years ago

And one last report, comparing page loads of the same list with Fanart TV turned on and off. As you can see, the Fanart TV requests add a lot of overhead - we go from a list loading in almost 20 seconds to a list loading in 1 second.

====================================================
TIMER REPORT (fanart.tv enabled)
----------------------------------------------------
0.510 sec - Initial API Request and Metadata Parsing
1.271 sec - The Bridge on the River Kwai
0.881 sec - Me Before You
0.867 sec - Carlito's Way
0.866 sec - Mary and Max
0.883 sec - Downfall
0.817 sec - Elite Squad: The Enemy Within
0.901 sec - Spotlight
0.716 sec - Short Term 12
0.749 sec - Land of Mine
0.650 sec - Brief Encounter
0.833 sec - Presto
0.866 sec - Marriage Story
1.150 sec - Finding Nemo
0.883 sec - Midnight Sun
0.916 sec - Donnie Darko
0.884 sec - Brokeback Mountain
1.149 sec - Harry Potter and the Goblet of Fire
0.783 sec - The Hustler
0.883 sec - Monty Python and the Holy Grail
0.867 sec - Dog Day Afternoon
0.002 sec - Next page
----------------------------------------------------
AVG      0.848 sec
TOTAL   18.329 sec
====================================================

====================================================
TIMER REPORT (fanart.tv disabled)
----------------------------------------------------
0.529 sec - Initial API Request and Metadata Parsing
0.376 sec - I, Daniel Blake
0.033 sec - Ip Man
0.004 sec - Hedwig and the Angry Inch
0.002 sec - A Bronx Tale
0.028 sec - Coraline
0.003 sec - Fiddler on the Roof
0.002 sec - Die Hard
0.002 sec - The Last Picture Show
0.002 sec - The Gentlemen
0.002 sec - Harry Potter and the Half-Blood Prince
0.002 sec - Hotel Rwanda
0.014 sec - Edward Scissorhands
0.007 sec - The Ten Commandments
0.003 sec - Ponyo
0.002 sec - Harvey
0.002 sec - The Wild Bunch
0.002 sec - The Killing
0.002 sec - Rush
0.010 sec - Kubo and the Two Strings
0.001 sec - Moonrise Kingdom
0.001 sec - Next page
----------------------------------------------------
AVG      0.024 sec
TOTAL    1.027 sec
====================================================
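
For anyone wanting to collect similar per-item timings, here is a minimal sketch of a timer that records labelled durations and prints them in a layout like the reports above. The `TimerReport` class and its methods are hypothetical and are not the code that produced those reports.

```python
import time
from contextlib import contextmanager


class TimerReport:
    """Collects (seconds, label) rows and renders a simple text report."""

    def __init__(self):
        self.rows = []

    @contextmanager
    def measure(self, label):
        # Time whatever runs inside the `with` block, even if it raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.rows.append((time.perf_counter() - start, label))

    def total(self):
        return sum(seconds for seconds, _ in self.rows)

    def render(self):
        lines = [f"{seconds:.3f} sec - {label}" for seconds, label in self.rows]
        lines.append(f"TOTAL {self.total():.3f} sec")
        return "\n".join(lines)
```

Usage would be along the lines of `with report.measure("Finding Nemo"): ...` around each item's lookup, followed by `print(report.render())` once the list is built.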

jurialmunkey commented 3 years ago

Actually this has got me thinking about a way to improve speeds and I think I might have a good idea that from preliminary testing is yielding some good results. Test code incoming soon!

matke-84 commented 3 years ago

@jurialmunkey That's great news. I'm available for any help or testing. For your information, I tested TMDb Helper and Elementum with the same criteria: 20 movies and TV shows, with and without fanart.tv (Elementum has the same option). It's amazing to me that with the fanart.tv option enabled, Elementum's list loading times are almost unchanged. Today almost everyone uses some skin; I use Titan Bingie mod, and without fanart the skins look very ugly. That is why it is important to increase that speed as well. Elementum also uses mostly Trakt lists, so it's even more fascinating to me that it opens so quickly. But I have to tell you that until the last update, Elementum also had a delay in opening. I don't know exactly what was done, or whether the number of connections to the servers was increased in the code, but now it works great. I hope you succeed too. That would be a great New Year's gift for all of us. :)

jurialmunkey commented 3 years ago

> Today almost everyone uses some skin; I use Titan Bingie mod, and without fanart the skins look very ugly.

Just FYI, TMDb supplies posters and fanart by default and you don't need fanart.tv enabled to retrieve them.

You only need fanart.tv if you rely on clearlogo, clearart, banner, discart or landscapeart.

matke-84 commented 3 years ago

@jurialmunkey That's the catch. Titan Bingie mod, currently the most popular skin for movies and TV shows, makes no sense without all of that. I don't know if you're aware of how popular your addon is. All my friends, both real and virtual, use TMDb Helper with one skin or another. Some skinners are trying to integrate it into their skins because that will be the real thing. And everyone who uses skins wants both looks and functionality. This is especially true for "Netflix" skins.

jurialmunkey commented 3 years ago

@matke-84 - Please try the latest master (v4.1.0) - you should get much improved times, as the individual requests are now threaded.

The overall performance appears to be much improved. I turned on everything and tried a 20-item Trakt list with nothing cached (so we're looking at three requests per item - one from the Kodi DB, one from TMDb, and one from Fanart TV) and the original load time of ~25 sec went down to ~7 sec. For a TMDb list it went down from ~18 sec to under 2 seconds!

There is now a slight performance cost if all requests are already cached, because we're adding the extra overhead of starting a thread for each item and then having to reconstruct the list order once the pool completes before sending to Kodi. However, the overall benefit is so huge when doing online lookups (completely uncached, the lists above loaded roughly 3.5x to 9x faster) that a minor cost when items are cached is not really of consequence, especially since the cache cost is so low already (even if it quadrupled we'd still be only looking at less than a 200ms increase overall).
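
To illustrate why threading helps with network-bound lookups, here is a minimal sketch using the standard library's `ThreadPoolExecutor`. It is not the addon's implementation: `fake_lookup` is a hypothetical stand-in that sleeps for 50ms instead of contacting a server, and `pool.map()` is used to keep results in input order rather than reconstructing the order by hand.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_lookup(item_id):
    """Hypothetical stand-in for one online request (~50 ms of waiting)."""
    time.sleep(0.05)
    return {"id": item_id}


def load_sequential(ids):
    # One request after another: total wait is roughly len(ids) * 50 ms.
    return [fake_lookup(i) for i in ids]


def load_threaded(ids, max_workers=20):
    # Requests wait on the "server" concurrently. map() yields results in
    # input order, regardless of which request finishes first.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_lookup, ids))
```

With 20 items, the sequential version waits about a second in total while the threaded one finishes in roughly the time of a single request, because the threads spend nearly all of their time waiting rather than computing.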

Like I said before, using Python I won't be able to compete with precompiled binary performance. The biggest issue is that Python threads don't run in parallel across cores (the interpreter's global lock lets only one thread execute Python code at a time), so even with threading we aren't using multiple processor cores. Basically we can get much improved performance dealing with network bottlenecks (i.e. retrieving the data) but not with CPU bottlenecks (i.e. processing the data we retrieved).

matke-84 commented 3 years ago

@jurialmunkey OK, I tested.

Device: Android TV box
Internet: 50/50
Skin: Titan Bingie mod
Settings: open 20 movies or TV shows, fanart.tv option enabled, Trakt lists (because of Elementum)

The average opening time with all options enabled is 15 seconds, sometimes shorter. Elementum opens the same content with the same settings in 6 seconds. The situation is better when loading seasons and episodes: they open almost identically, with a difference of a few seconds in favor of Elementum. I tested with Kodi installed from scratch, without cache or thumbnails. When the data is cached, it goes much faster.

If we compare TMDb Helper with the previous version there are improvements, but it is still slower than Elementum. As I said, there was a delay in Elementum until the last update. I don't know how these loading speeds in Elementum were obtained, but they are impressive. But certainly this is much better. Thank you for trying to improve this addon with each update.

One suggestion: try to implement an option to reduce the number of loaded movies or TV shows; it also exists in Elementum. We would gain loading speed. It would also be nice if there were an option for the user to choose what to include from the data (movie duration, MPAA rating, release year ...) or from fanart (clearlogos, cleararts, cdarts ...). I seem to have seen this option in some scraper. This would allow everyone to choose what they want. It is always better to have a choice. Less data, higher speed; more data, a little slower. Everyone would be satisfied with that, both those to whom details are important and those to whom they are not. I'm sorry, I'm mentioning some things here that I'm not even sure are possible, but they seem very interesting to me as an end user.

jurialmunkey commented 3 years ago

> If we compare TMDb Helper with the previous version there are improvements, but it is still slower than Elementum. As I said, there was a delay in Elementum until the last update. ... I don't know how these loading speeds in Elementum were obtained, but they are impressive.

The speeds are obtained by using pre-compiled binary code which is optimised for specific platforms and using true multiprocessing.

Like I said before, an interpreted language like Python cannot compete with compiled binaries written in C++ or Go. Generally, however, it is preferred that Kodi addons don't use binary code unless absolutely necessary and instead stick to interpreted languages, so that the code stays transparent and readable.

> One suggestion: try to implement an option to reduce the number of loaded movies or TV shows; it also exists in Elementum.

Normally people ask to increase the number, not reduce it... It's not really something I want to make editable, as people will just increase it and place unnecessary extra strain on these services.

> It would also be nice if there were an option for the user to choose what to include from the data (movie duration, MPAA rating, release year ...) or from fanart (clearlogos, cleararts, cdarts ...). ... Less data, higher speed; more data, a little slower.

Not true. This will not make much, if any, difference to performance.

The biggest time cost is server response speed, not the size of the data returned. The data returned is in small text files of only a few KB, so attempting to reduce their size is premature optimisation that won't provide any noticeable performance increase.

For instance, data from Fanart.TV is about 20KB of uncompressed text, which on a 50Mbps connection should take max ~3ms to download. The other ~500ms of time taken is how long it takes for the server to react to the request and do a database lookup - it has nothing to do with the amount of data requested.
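
The download-time estimate above can be checked with a short calculation. This is a sketch under two assumptions: the helper name `transfer_ms` is hypothetical, and 1 KB is taken as 1000 bytes with no protocol overhead.

```python
def transfer_ms(size_kb: float, link_mbps: float) -> float:
    """Time to move `size_kb` kilobytes over a `link_mbps` megabit/s link, in ms."""
    bits = size_kb * 1000 * 8                 # kilobytes -> bits
    seconds = bits / (link_mbps * 1_000_000)  # bits / (bits per second)
    return seconds * 1000


# ~20 KB response on a 50 Mbps connection.
print(round(transfer_ms(20, 50), 1))
```

Against a server response time of around 500ms, the transfer itself is well under 1% of the total, which is why trimming the payload would barely change the load time.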

Making a car more aerodynamic is not going to change how long it takes to start the engine.

The options available are already the ones which if disabled have an actual impact on performance.

matke-84 commented 3 years ago

@jurialmunkey Ok, thanks anyway for everything. I am waiting for some other updates and improvements. ;-)

matke-84 commented 3 years ago

@jurialmunkey I'm sorry, I forgot to ask you: in that case, could you turn on the same features for TMDb lists? Now that we have gained speed, I think there is no reason to exclude information such as movie duration and MPAA rating, and artwork such as clearlogo, clearart and cdart could also be integrated into TMDb movie and TV show lists.