AdmiralGT / botc-scripts

Blood on the Clocktower Script Database
https://botc-scripts.azurewebsites.net/
MIT License

Performance of the deployed app is poor #272

Open tchajed opened 1 year ago

tchajed commented 1 year ago

Describe the bug

Loading the site is quite slow in the current deployment. For example, loading the home page or a single script each takes almost 4 seconds for me.

I suspect this is due to the deployed server and not the code per se, but if there's something that can be done in the code I'd be happy to help. (I'm not a frontend developer, otherwise I'd offer to help with UI improvements as well.)

To Reproduce

Load the home page in a browser, or to remove some noise from page rendering and such run these from the command line:

time curl 'https://botc-scripts.azurewebsites.net/' >/dev/null
# 3.8s
time curl 'https://botc-scripts.azurewebsites.net/script/1280/1.0.0' >/dev/null
# 4.3s

AdmiralGT commented 1 year ago

Thanks for the report. If the issues are back-end related, I can only presume it's the PostgreSQL database, since when running the webserver locally I still see similar times when downloading specific script instances.

admiralgt@Persephone:~/github/AdmiralGT/botc-scripts$ time curl 'localhost:8000/script/1280/1.0.0' > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 19514  100 19514    0     0   4473      0  0:00:04  0:00:04 --:--:--  4473

real    0m4.370s
user    0m0.010s
sys     0m0.000s
admiralgt@Persephone:~/github/AdmiralGT/botc-scripts$ time curl 'localhost:8000' >/dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 75774  100 75774    0     0  42402      0  0:00:01  0:00:01 --:--:-- 42402

real    0m1.795s
user    0m0.008s
sys     0m0.000s

I've turned on performance insights for the PostgreSQL database, but it will take some time to gather results. Additionally, the public backend webserver is reporting 35% average CPU use and 70% memory usage, which suggests it isn't a bottleneck beyond the sheer number of queries being made, and the metrics don't suggest the site ever receives significant load.

Additionally, the base API is significantly faster, despite returning a similar amount of data and running a similar query to the main page.

admiralgt@Persephone:~/github/AdmiralGT/botc-scripts$ time curl 'https://botcscripts.com/api/scripts/' >/dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6043  100  6043    0     0   5872      0  0:00:01  0:00:01 --:--:--  5872

real    0m1.037s
user    0m0.019s
sys     0m0.000s

From what I understand, the Django template language is quite slow, and I suspect this is a significant cause of the slowness. I have run some code timings locally, and the actual work I'm doing above the Django layer takes minimal time, even when iterating over the entire script database for the script page. Sadly, the debugging tools for the Django templating language aren't very good, so I really don't have much idea how to improve performance.
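
One option for getting per-request SQL and template timings when running locally is django-debug-toolbar. This is only a sketch of its standard setup (nothing like it exists in the repo yet), assuming a conventional settings.py and urls.py:

# settings.py (local development only)
INSTALLED_APPS += ["debug_toolbar"]
MIDDLEWARE = ["debug_toolbar.middleware.DebugToolbarMiddleware"] + MIDDLEWARE
INTERNAL_IPS = ["127.0.0.1"]

# urls.py
from django.urls import include, path
urlpatterns += [path("__debug__/", include("debug_toolbar.urls"))]

Its SQL panel lists every query a page issues along with its duration, and the Templates panel shows which template triggered each render, which makes repeated per-row queries easier to spot.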

I suspect the template rendering of the thumbs/votes etc. is a big cause, but I don't have much idea how to resolve this as, sadly, I'm also not a front-end engineer.

Sadly, progress is likely to remain slow unless I can get additional help. I also need to perform a database migration, which requires backups etc. first, but hopefully that will also improve performance.

tchajed commented 1 year ago

Thanks for the detailed explanation! It does seem like something could be done and I'd be happy to take a look.

Could you give me a quick explanation for how to run the site locally? I tried to figure it out but I've never used Django or poetry so I couldn't guess what I needed to do.

AdmiralGT commented 1 year ago

Thanks for the prompt; I've had a long-standing ticket about the lack of a local developer environment. I'll write something up this evening and get it merged.

tchajed commented 1 year ago

Awesome, thanks!

tchajed commented 1 year ago

I was able to get things running using #186. However, even after populating the database with 1000 scripts (all copies of BMR with distinct names) loading one takes 150ms, and requesting JSON via the API takes only 32ms. This points to some other feature, like comments or votes, having an outsized impact on performance, even if the script being loaded doesn't have many, but I'd like to test that by using a more realistic database.

AdmiralGT commented 1 year ago

Interesting! My suspicion would fall on the Character model: I suspect that when it's used, there are lots of separate queries to the database to get the complete set of Characters or to fetch a specific character instance. My understanding of Django is that these sorts of queries are meant to be cached so that an individual session doesn't keep repeating them, but it's quite possible some of the architecture here is incorrect.
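
If that is what's happening, it's the classic N+1 query pattern: the view fetches the script version in one query, and the template then triggers one extra query per related character or tag. Django can batch those lookups with select_related/prefetch_related; here's a rough sketch using hypothetical model and field names (a ScriptVersion with a script foreign key and characters/tags relations), not the repo's actual code:

# views.py (sketch only; model and field names are guesses)
from scripts.models import ScriptVersion  # hypothetical import path

def script_version_with_relations(pk):
    # select_related pulls the foreign key in the same SQL query;
    # prefetch_related fetches each many-to-many relation in one extra
    # query instead of one query per object touched in the template.
    return (
        ScriptVersion.objects
        .select_related("script")
        .prefetch_related("characters", "tags")
        .get(pk=pk)
    )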

Sadly I can't share the comments/votes portions of the database, as they reference user objects that I'm not able to share, and without those the votes/comments won't make sense.

I can share, though, that there are 2107 script versions in the database, linked to 1225 script objects, along with 1019 favourites, 650 votes and 53 comments.

Is there any chance you could start a PR with any tooling you've written to import 1,000 scripts? I suspect for this sort of performance debugging we're going to need to build out some model creation.
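
In case it's useful as a starting point, the model creation could live in a small management command that builds scripts and their related objects directly, which the upload form can't do for votes/comments. A very rough sketch, in which every model and field name is a guess rather than the repo's actual schema:

# scripts/management/commands/seed_perf_data.py (sketch; names are guesses)
from django.core.management.base import BaseCommand
from scripts.models import Script, ScriptVersion  # hypothetical import

class Command(BaseCommand):
    help = "Create synthetic scripts for performance testing."

    def handle(self, *args, **options):
        for i in range(1000):
            script = Script.objects.create(name=f"Perf test {i}")
            ScriptVersion.objects.create(
                script=script,
                version="1.0.0",
                content=[],  # assumes the script JSON lives in a JSONField
            )
        # favourites, votes and comments would be created the same way,
        # attached to a throwaway user, to mimic the live ratios above.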

tchajed commented 1 year ago

I can turn it into a proper script later, but this is a short fish script to do a single upload:

begin
  set -l file bmr-new.json # from script tool
  # only needed on the first run, to obtain a CSRF cookie
  curl -c cookies.txt 'localhost:8000/script/upload' >/dev/null
  set -l author (jq -r '.[] | select(.id == "_meta") | .author' $file)
  set -l name (jq -r '.[] | select(.id == "_meta") | .name' $file)
  set -l csrftoken (grep 'csrftoken' cookies.txt | cut -f7)
  curl -c cookies.txt -b cookies.txt -X POST localhost:8000/script/upload \
    -F csrfmiddlewaretoken="$csrftoken" \
    -F "name=$name" -F "author=$author" \
    -F "script_type=Full" -F "version=1" \
    -F content=@$file
end

AdmiralGT commented 1 year ago

I've installed some additional tools in https://github.com/AdmiralGT/botc-scripts/tree/grt/perf-improvements

There are a few key areas that seem to be the cause of the slowdowns and that I think should be the focus:

Main Page

Of the SQL queries, most of the time is spent in 2 areas:

  1. Simply counting all the scripts in the database. I believe this is impacted by the ScriptViewManager class trying to annotate the likes, favourites and comments: the query drops from 300ms to 50ms when that class is removed (see the sketch at the end of this section).
  2. The script tags, specifically getting the tags for each entry in the table.

This fits with your investigation: your test data contains none of these object types, so those portions probably evaluate quickly, whereas in the real environment there are a number of comments/votes/favourites/tags that have to be joined together.

The Template Profiler also shows a lot of time spent in tags.html (which is responsible for displaying tags), although that looks to be caused simply by the SQL query to get the tags.
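
One possible way to take the pressure off that counting query, sketched below with hypothetical model names rather than the actual manager code, is to compute each count in its own subquery instead of annotating several reverse relations on one queryset, since stacking those annotations multiplies the joined rows before counting:

# managers.py (sketch; Script/Vote model and field names are assumptions)
from django.db.models import Count, IntegerField, OuterRef, Subquery
from django.db.models.functions import Coalesce
from scripts.models import Script, Vote  # hypothetical import

def scripts_with_counts():
    votes = (
        Vote.objects.filter(script=OuterRef("pk"))
        .order_by()
        .values("script")
        .annotate(n=Count("pk"))
        .values("n")
    )
    # the same pattern would repeat for favourites and comments
    return Script.objects.annotate(
        num_votes=Coalesce(Subquery(votes, output_field=IntegerField()), 0),
    )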

Script Pages

Of the SQL queries, the main time offenders are:

  1. Tags again, simply displaying them on the page.
  2. The script similarity calculation. This actually queries the entire database of scripts and simply loops through the whole list, so it may be time to start thinking about how to do this more intelligently (see the sketch at the end of this comment). Running locally it takes 400ms.
  3. Counting the number of collections a script is part of.
  4. Iterating through all the characters to display them and work out where to add the line breaks.

Sadly the Template Profiler doesn't provide much information here, beyond the fact that nearly all the time is spent in script.html, but I believe the problem is again the rendering of the character names.

Hopefully I can use this to redesign some elements of the database and improve performance.
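
For the similarity calculation, one direction that might be worth exploring (sketched with hypothetical model/field names, not the current implementation) is letting the database rank scripts by character overlap instead of looping over every script in Python:

# sketch; assumes ScriptVersion has a many-to-many "characters" relation
from django.db.models import Count
from scripts.models import ScriptVersion  # hypothetical import

def most_similar(version, limit=10):
    character_ids = list(version.characters.values_list("pk", flat=True))
    return (
        ScriptVersion.objects
        .exclude(pk=version.pk)
        .filter(characters__in=character_ids)
        # the filter restricts the join to shared characters, so this
        # Count is the size of the overlap with the current script
        .annotate(shared=Count("characters", distinct=True))
        .order_by("-shared")[:limit]
    )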

alexanderfletcher commented 7 months ago

Do you have an example dataset or a way to generate a decent number of scripts? I'm sure there is some low-hanging fruit here :)