Closed AndrewVSutherland closed 5 years ago
I just added tags to application areas where the top-level page performance is over the 400ms metric I mentioned (I'll happily take them off once they come in under it :)). The main point is that this issue potentially touches all areas of the LMFDB and is not specific to the "backend", although there is certainly work to be done there.
But if you then stop and do a handful of requests 10-15 minutes later you have a good chance of seeing slow response again. I don't know why ...
This is because those values are cached in the worker instance. The workers are restarted after a certain number of requests -- as defined in the gunicorn config file -- because otherwise at least two problems arise: memory leaks, and GAP stops working (it has its own strange limits).
That's why such statistical values should be computed "offline" and cached in the database. For example, every 10 minutes a small Python script is run via a cronjob that executes those queries to gather all the interesting statistics and then stores the result in a "stats" table. The page then simply pulls out the relevant document and displays the numbers.
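A minimal sketch of this compute-offline-and-cache pattern (the collection names and fields are purely illustrative; a real cronjob would use pymongo collections, for which plain dicts stand in here):

```python
import time

def compute_stats(records):
    """Gather the expensive summary numbers in a single pass."""
    degrees = [r["degree"] for r in records]
    return {
        "count": len(records),
        "max_degree": max(degrees) if degrees else 0,
    }

def refresh_stats_cache(source, stats_store):
    """Run the slow queries once and store the result with a timestamp.

    A cronjob would call this every 10 minutes; page handlers then read
    stats_store["summary"] instead of re-running the queries on every hit.
    """
    summary = compute_stats(source)
    summary["computed_at"] = time.time()
    stats_store["summary"] = summary
    return summary
```

With this shape, a restarted worker never pays the recomputation cost: the page handler only does a single cheap lookup of the cached document.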
+1 @haraldschilly Thank you for the explanation! This explains the behavior I was seeing.
I agree 100% that statistics should be cached in the database. Several of us have suggested the same, but I think some were operating under the false assumption that instance caching makes this unnecessary.
It does seem strange to me that instances should be getting restarted so quickly. Surely the memory leaks are not so bad that we need to restart an instance after a thousand requests (in just 10 minutes). Is it really that bad?
Do you know how the current gunicorn values were chosen? Is there reason to believe they are optimal?
I hadn't realized that the instances were automatically restarted after a certain number of requests. But they are also restarted anytime they fail to respond to some request quickly enough (within 20 seconds, maybe), anytime they try to use too much memory, or anytime they crash for some other reason. (All of these things happen too often.)
I would like to deal with performance issues in general by using some caching outside of our python processes. For the Warwick and Bristol setups, incoming requests go through apache first, which forwards them to gunicorn, which forwards them to our code. It is possible to have apache cache the response, so that when it gets the same URL again it doesn't need to pass the request to gunicorn, but can instead send exactly the same response that it sent before.
There are some complications, which is why we haven't been doing this. (We did try such a setup a long time ago, though.) For example, we set cookies, and certain things can only be done by users who are logged in. One possibility is to make www.lmfdb.org cookieless, and force users who want to log in to use the address for the Warwick server directly, or some other address set up for this purpose. Since that's a special use case, it wouldn't affect performance much. (There is also a cookie set to remember the show/hide menu setting.)
Do you know how the current gunicorn values were chosen? Is there reason to believe they are optimal?
Well, there is certainly nothing "optimal" about them, and I don't even know what the values are right now. I added that because I noticed that sometimes pages had random internal errors, and I tracked this down to either broken subprocesses or GAP bailing out. That's why I told gunicorn to restart workers. As far as I understand, it forks from a master process -- so that should be quite fast.
Whatever the values are, start by increasing them by a factor of 10 and keep an eye on the workers.
@jwbober Regarding cookies, I already find that editing knowls on www.lmfdb.org is painfully slow and have switched to using lmfdb.warwick.ac.uk for knowl editing anyway and we could just tell others to do the same. I don't think making www.lmfdb.org cookieless would necessarily be a bad thing (and it would avoid triggering warnings for users whose browser settings alert them to cookies).
FWIW, I have only been able to edit knowls from lmfdb.warwick.ac.uk for timeout reasons. For the future, we should try putting lmfdb.org on a cookieless diet.
Regarding cookies, I already find that editing knowls on www.lmfdb.org is painfully slow and have switched to using lmfdb.warwick.ac.uk for knowl editing anyway and we could just tell others to do the same.
The writes must go to Warwick over a lovely ssh tunnel; it works to keep the data in sync, but it is not fast enough for lively edits. This is a temporary solution, and after the release we need to decide what to do about it.
Do you know how the current gunicorn values were chosen? Is there reason to believe they are optimal?
We reset a worker after 100 requests or after being silent for 30s; see: http://docs.gunicorn.org/en/stable/settings.html#max-requests and http://docs.gunicorn.org/en/stable/settings.html#timeout. Not optimal -- we could do what @haraldschilly suggested. However, I see this more as a safety feature: once in a while a worker goes berserk (uses too much RAM/CPU), and this way we are assured that such workers don't stay around for very long. Further, relying on workers to store some data for longer sounds more like a workaround than a solution to me.
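For reference, gunicorn is configured through a plain Python file, so the two settings cited above can be sketched as follows (the file name and the commented 10x alternative are illustrative, not taken from the actual deployment):

```python
# gunicorn_config.py -- settings matching the behavior described above:
# recycle a worker after 100 requests, kill it after 30s of silence.

max_requests = 100   # restart a worker after this many requests
timeout = 30         # seconds of worker silence before it is killed

# If these are raised 10x as suggested, adding jitter prevents all workers
# from being recycled at the same moment:
# max_requests = 1000
# max_requests_jitter = 100
# timeout = 300
```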
@edgarcosta Agree 100% that we should not be relying on worker instance caching for performance, but there is at least some non-trivial cost to restarting them even without any caching (all the page load times have a bimodal distribution, it's just not as extreme in some cases). My only suggestion is that we could improve the average significantly by increasing the numbers 100 and 30s (I think 30s in particular is way too short and it seems very low risk to increase it, but please feel free to disagree).
Do we have any reason to believe that 100 is a better choice than 1000, say?
No reason at all. I have spent 0 time optimizing these parameters.
@AndrewVSutherland I have just changed some things in the cloud. Would you mind running the same batch of tests again? Even though someone is adding stuff to the db at the moment, I expect the connection speed and latency between the client and the db to be much better.
Will do, running now.
@edgarcosta It looks like you trimmed about 20 milliseconds off all the connect times (not much change to the TTFBs, but I assume you weren't expecting much there?). I'll send the detailed stats to you in an e-mail and post a summary.
Thank you!
I wasn't expecting anything amazing, as the ping went from 0.6 milliseconds to 0.3 milliseconds, but overall the network was already quite good. Most important of all, we are not paying for the traffic between the clients and the db.
Here are the updated stats. Note that these are total times that include the connect times. Previously the connect times were tightly clustered around 45-47 milliseconds, now they are tightly clustered around 25-27 milliseconds (which is significant but probably too small to separate from noise in the total timings below).
Times in seconds ("www" = www.lmfdb.org, "warwick" = lmfdb.warwick.ac.uk):

page | www min | www max | www ave | www dev | warwick min | warwick max | warwick ave | warwick dev |
---|---|---|---|---|---|---|---|---|
L/degree/1 | 0.071 | 0.412 | 0.083 | 0.050 | 0.188 | 0.541 | 0.198 | 0.035 |
L/degree/2 | 0.070 | 0.401 | 0.077 | 0.033 | 0.188 | 0.852 | 0.208 | 0.091 |
L/degree/3 | 0.070 | 1.073 | 0.088 | 0.106 | 0.189 | 0.929 | 0.211 | 0.106 |
L/degree/4 | 0.071 | 5.081 | 0.136 | 0.501 | 0.188 | 0.837 | 0.201 | 0.066 |
zeros/zeta/ | 0.097 | 0.473 | 0.118 | 0.071 | 0.192 | 0.867 | 0.211 | 0.085 |
ModularForm/GL2/Q/holomorphic/ | 0.399 | 0.916 | 0.486 | 0.104 | 0.770 | 2.621 | 0.863 | 0.206 |
ModularForm/GL2/Q/Maass/ | 0.089 | 0.483 | 0.129 | 0.063 | 0.207 | 1.065 | 0.233 | 0.106 |
ModularForm/GL2/TotallyReal/ | 0.111 | 0.179 | 0.134 | 0.018 | 0.292 | 1.137 | 0.322 | 0.111 |
L/degree3/MaassForm/ | 0.090 | 0.110 | 0.094 | 0.003 | 0.220 | 0.638 | 0.233 | 0.045 |
ModularForm/GSp/Q/ | 0.565 | 1.322 | 0.912 | 0.255 | 0.845 | 6.708 | 1.290 | 0.708 |
EllipticCurve/Q/ | 0.117 | 0.760 | 0.318 | 0.184 | 0.295 | 5.801 | 0.511 | 0.752 |
EllipticCurve/ | 0.119 | 1.864 | 0.457 | 0.337 | 0.302 | 4.739 | 0.529 | 0.554 |
Genus2Curve/Q/ | 0.117 | 0.193 | 0.134 | 0.012 | 0.291 | 0.982 | 0.325 | 0.104 |
NumberField/ | 0.119 | 0.164 | 0.133 | 0.009 | 0.297 | 1.503 | 0.335 | 0.145 |
LocalNumberField/ | 0.112 | 0.184 | 0.122 | 0.009 | 0.292 | 1.016 | 0.320 | 0.107 |
Character/ | 0.113 | 0.476 | 0.125 | 0.036 | 0.293 | 1.024 | 0.322 | 0.109 |
Character/Dirichlet/ | 0.113 | 0.146 | 0.118 | 0.004 | 0.293 | 1.757 | 0.323 | 0.158 |
ArtinRepresentation/ | 0.112 | 0.497 | 0.131 | 0.053 | 0.291 | 2.274 | 0.350 | 0.236 |
GaloisGroup/ | 0.106 | 0.450 | 0.117 | 0.034 | 0.288 | 0.994 | 0.305 | 0.070 |
SatoTateGroup/ | 0.115 | 0.493 | 0.133 | 0.037 | 0.296 | 2.203 | 0.362 | 0.238 |
Lattice/ | 0.116 | 0.761 | 0.179 | 0.072 | 0.295 | 1.354 | 0.353 | 0.158 |
After creating an index I think the modular forms main page is faster now; maybe you can check again at some point.
@sehlen running now. Will need to rerun due to a database issue. Let me wait until things settle down.
Updated performance stats for the launch. @sehlen Classical modular forms main page is substantially faster.
www.lmfdb.org (times in seconds)

page | min | max | ave | dev |
---|---|---|---|---|
L/degree/1 | 0.066 | 0.399 | 0.072 | 0.033 |
L/degree/2 | 0.066 | 0.439 | 0.076 | 0.051 |
L/degree/3 | 0.066 | 0.115 | 0.068 | 0.005 |
L/degree/4 | 0.065 | 0.160 | 0.070 | 0.013 |
zeros/zeta/ | 0.090 | 0.205 | 0.096 | 0.012 |
ModularForm/GL2/Q/holomorphic/ | 0.166 | 0.291 | 0.192 | 0.027 |
ModularForm/GL2/Q/Maass/ | 0.086 | 0.810 | 0.119 | 0.071 |
ModularForm/GL2/TotallyReal/ | 0.103 | 0.157 | 0.120 | 0.015 |
L/degree3/MaassForm/ | 0.083 | 0.224 | 0.092 | 0.020 |
ModularForm/GSp/Q/ | 0.434 | 1.317 | 0.710 | 0.250 |
EllipticCurve/Q/ | 0.109 | 0.406 | 0.213 | 0.134 |
EllipticCurve/ | 0.111 | 0.915 | 0.409 | 0.235 |
Genus2Curve/Q/ | 0.108 | 0.503 | 0.136 | 0.052 |
NumberField/ | 0.110 | 0.561 | 0.132 | 0.054 |
LocalNumberField/ | 0.104 | 0.646 | 0.125 | 0.067 |
Character/ | 0.106 | 0.196 | 0.116 | 0.011 |
Character/Dirichlet/ | 0.105 | 0.492 | 0.119 | 0.053 |
ArtinRepresentation/ | 0.105 | 0.562 | 0.123 | 0.052 |
GaloisGroup/ | 0.099 | 0.443 | 0.111 | 0.034 |
SatoTateGroup/ | 0.110 | 0.326 | 0.128 | 0.023 |
Lattice/ | 0.108 | 0.609 | 0.176 | 0.060 |
As a minor boon I note that running all our tests (./test.sh) is faster than it used to be (approx 5 mins --> 4 mins).
I ran through the Google Analytics logs, pulling out the slowest pages that were being accessed regularly by our users. Here are some of the worst offenders, two of which should be addressed by the indexes Edgar is adding tonight. I plan to take a look at the rest tomorrow to see if there are any easy fixes (e.g. adding an index):
- http://www.lmfdb.org/NumberField/?ram_primes=11 (+ many similar queries with different primes)
- http://www.lmfdb.org/L/ArtinRepresentation/2.163.8t12.1c2/
- http://www.lmfdb.org/EllipticCurve/Q/?count=100 (should be faster tomorrow)
- http://www.lmfdb.org/EllipticCurve/browse/ (should be faster tomorrow EDIT: no it should not, my mistake, I was thinking of http://www.lmfdb.org/EllipticCurve/?count=100)
- http://www.lmfdb.org/EllipticCurve/browse/6/
- http://www.lmfdb.org/EllipticCurve/Q/stats
There is also http://www.lmfdb.org/EllipticCurve/6.6.371293.1/79.1/c/4, but that is already listed as a separate issue (https://github.com/LMFDB/lmfdb/issues/1358).
For the Artin L-function, it looks like the corresponding Artin representation page loads quickly. The Artin L-function code will be revised when the L-function data goes in the database.
lmfdb.org is now running with the new indexes.
On the number field search, I created an index on warwick for just ramified primes. I guess you can test timings on beta vs the main site.
@jwj61 Do you have any idea why http://www.lmfdb.org/NumberField/?degree=2 is so slow?
The indexes are all there, but the following command still takes forever:
2016-05-18T01:52:17.209+0000 I COMMAND [conn345] command numberfields.fields command: count { count: "fields", query: { degree: 2 } } planSummary: COUNT_SCAN { degree: 1, ramps: 1 } keyUpdates:0 writeConflicts:0 numYields:28285 reslen:62 locks:{ Global: { acquireCount: { r: 56572 } }, MMAPV1Journal: { acquireCount: { r: 28286 } }, Database: { acquireCount: { r: 28286 } }, Collection: { acquireCount: { R: 28286 } } } protocol:op_query 3094ms
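As a side note, mongod log lines like the one above can be mined in bulk to find slow queries. A hedged sketch (the regex is tailored to the line format shown here and may need adjusting for other mongod versions):

```python
import re

# Extract the plan summary (e.g. "COUNT_SCAN { degree: 1, ramps: 1 }") and
# the trailing duration in milliseconds from a mongod COMMAND log line.
LOG_RE = re.compile(
    r"planSummary: (?P<plan>\S+(?: \{[^}]*\})?).*?(?P<ms>\d+)ms$"
)

def parse_slow_query(line):
    """Return {'plan': ..., 'millis': ...} for a COMMAND log line, or None."""
    m = LOG_RE.search(line)
    if not m:
        return None
    return {"plan": m.group("plan"), "millis": int(m.group("ms"))}
```

Filtering the log through this and sorting by `millis` gives a quick list of the worst offenders without waiting for users to report them.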
@AndrewVSutherland These two didn't get faster: http://www.lmfdb.org/EllipticCurve/browse/ and http://www.lmfdb.org/EllipticCurve/browse/6/.
Both of them make a lot of calls to the db. None of the calls takes forever, but a bunch of them still take enough time to show up in the logs: 2 requests for EllipticCurve/browse/ and 2 requests for EllipticCurve/browse/6.
@jwj61 I don't know why http://www.lmfdb.org/NumberField/?degree=2 is slow. Edgar and I looked at this earlier today, and I can tell you that a large amount of the time is in the call fields.find({'degree': int(2)}), and as you said there is an index on degree. This should be faster.
The index you added on the field 'ramps' definitely helped; compare:

- http://beta.lmfdb.org/NumberField/?ram_primes=11 (about 4.6s)
- http://www.lmfdb.org/NumberField/?ram_primes=11 (about 18.5s)
But as we've found elsewhere (e.g. ainvs and torsion_structure), MongoDB does not handle indexes on array fields very well. At least for exact matches, I suspect it would be a lot faster if ramps were just a string encoding of the array (for inexact matches it might be slower, although you could do a regex query on the string; I'm not sure how the two would compare -- probably worth testing).
@edgarcosta Yeah, the two you mentioned both trigger the computation of a whole lot of statistics (I was confused about /browse, see my edit above). The best way to speed this up is to precompute the statistics, which I know @JohnCremona is planning to do. I don't see an easy way to speed up the queries you noted; 'conductor_name', 'degree' and 'field_label' are all already indexed. (Storing the prefix of the field label separately rather than using a regex query would speed some of them up, but there is no point in making this change since precomputing the stats will make it unnecessary, and both require updating all the records in nfcurves.)
On the number field search, I created an index on warwick for just ramified primes. I guess you can test timings on beta vs the main site.
Now also on lmfdb.org
@AndrewVSutherland I agree that the solution is precomputation
For ramification, I am aware of the indexing issue, but this is a case where we often make use of its flexibility, since we allow searches where certain primes are in the set of ramifying primes (but not necessarily all of it), or not in it (the unramified search field).
On degree 2, the issue may simply be the 600,000 search results. It is worth checking that the sort matches an index as well, but my recollection is that it does.
I think I understand it a bit better, each query can only use one index, and in this case it is using the index { degree: 1, ramps: 1 }, and perhaps ramps being a multikey index is slowing it down.
The search with ?ram_primes=11 was not specifying the degree, so the index you mentioned was not used. So, earlier today I made an index for just ramps, which sped it up considerably.
If it is considered a priority, we could speed up searches which specify all ramified primes by adding another entry to the database, but there is not much we can do about searches which involve only some ramified primes or which specify unramified primes.
@jwj61 I was talking about: http://www.lmfdb.org/NumberField/?degree=2, this one is using the index { degree: 1, ramps: 1 }, and I don't really understand why.
The way I understand indexes is as follows: internally, they are essentially a sorted list of pointers to database records. (The sort may be stored as something like a binary tree for faster traversal, but it is still a sort.)
If you make a compound index, say first on degree and then on signature, then it sorts by degree and breaks ties by signature. The database should then not need an index specifically on degree, because it can use the compound index. If you want to sort by signature in this example, you need a separate index for that. So an index on (A, B, C, D) should automatically work for (A), (A, B), (A, B, C), and (A, B, C, D).
So, we should not need a number field index on just degree -- we have several which start with degree. In fact, we do have an index on degree. In spite of this, it looks like it picks (degree, ramps) from the options, and maybe the inclusion of ramps (which is a list) is slowing it down.
In the end, my answer is "I don't know" as to why it picks this compound index over just "degree". It seems like a bug/inefficiency in mongo.
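The prefix rule described above can be made concrete with a tiny sketch (illustrative only, not LMFDB code):

```python
# A compound index can directly serve lookups/sorts on each of its
# prefixes: an index on (A, B, C, D) serves (A), (A, B), (A, B, C),
# and (A, B, C, D), but not (B) or (A, C) on their own.

def index_prefixes(compound_index):
    """Return the key sequences a compound index can serve directly."""
    return [tuple(compound_index[:i]) for i in range(1, len(compound_index) + 1)]
```

This is why the single-key index on degree looks redundant: several existing compound indexes already have degree as their first key.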
@jwj61, @edgarcosta Check out http://www-central3.lmfdb.xyz/NumberField/?degree=2
I just deleted the index degree_1_ramps_1 and recreated it. And shazam, it's under 0.5s.
Yet another reason to consider rebuilding all the indexes (issue #1294)
And @jwj61 I agree with your analysis: MongoDB is doing something here that doesn't really make sense, and it really should favor the index degree_1. I agree it could use any of the multikey indexes that have degree as the first field, but it ought to prefer the index with the fewest fields (the index pages are going to be smaller -- not a huge deal, but in general slightly faster). BTW, I expect it is using a B-tree internally (like a binary tree, but each node has many children, not just 2), as this would be the standard thing to do and is much cheaper to update than a binary tree.
I'm going to try deleting the degree_1_ramps_1 key on atkin and see what happens.
Dropping the index sped things up substantially, try http://beta.lmfdb.org/NumberField/?degree=2.
@jwj61 Shall I add the index back in? (This may make the numberfields.fields collection unavailable for a few minutes on atkin.)
Sure.
In progress now; best not to access number field pages on beta.lmfdb.org for a few minutes (it may force the lmfdb client to disconnect because of a timeout).
Done and still fast even with the index back.
So the issue here may have more to do with the state of the indexes than anything else. @edgarcosta we should come up with a plan for https://github.com/LMFDB/lmfdb/issues/1294 and try it out on atkin first to get a sense of how long it will take.
One could argue that it would be worth waiting until we have a documented manifest of all the indexes that should be present (about to raise a separate issue for this), but that is going to take time and I don't think we should wait for that.
@AndrewVSutherland tell me what to do on the DB and I will take care of it.
@jwj61 (regarding https://github.com/LMFDB/lmfdb/issues/1247#issuecomment-219923826) Looking through the google analytics pages for the last week, among the queries with ram_primes non-empty, it looks like ram_quantifier=all was set on about twice as many page views as ram_quantifier=some.
There were only a few hundred in total, so I'm not sure how good this sample is, but it's clear that ram_quantifier=all is a pretty common use case and probably worth optimizing. (I would store the whole list as a string with no whitespace, "[2,3,11]" say, convert the input to the same canonical form with the primes sorted, and then do an exact match on a single-key index.)
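A minimal sketch of the proposed encoding (the function name is made up; sorting and deduplicating produces the canonical string to store, index, and match against):

```python
# Canonical whitespace-free encoding of a set of ramified primes, suitable
# for an exact match on a single-key index: user input is normalized the
# same way before querying.

def encode_ramps(primes):
    """Sorted, deduplicated, no whitespace, e.g. [11, 3, 2] -> "[2,3,11]"."""
    return "[" + ",".join(str(p) for p in sorted(set(primes))) + "]"
```

The same field could still support the ram_quantifier=some case via a regex query on the string, at whatever cost that turns out to have.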
@edgarcosta In the immediate term, rebuild the index degree_1_ramps_1 on numberfields.fields. In the slightly longer term, we should test (e.g. on ms1) how long it will take to rebuild the indexes on every collection in every database (and verify everything works correctly when we are done). If we conclude that is a reasonable thing to do, then we could do the same thing on ms (swapping out with the replica of course).
@jwj61 What you said is very much in line with https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/#sort-and-index-prefix
I think what happened here is that it had (A, B) and (A, C) indexes, and it preferred (A, B) over (A, C) for some reason that I don't understand. Perhaps by the order in which they showed up?
@AndrewVSutherland To be clear, do you want me to do (1) or (2)?

1. db.fields.reIndex()
2. db.fields.dropIndex({ 'degree': 1, 'ramps': 1 }); db.fields.createIndex({ 'degree': 1, 'ramps': 1 })

I think they aren't equivalent.
I was assuming (1), but in fact I did (2) (which would be a pain to do everywhere). How are they inequivalent?
(2) puts the index { 'degree': 1, 'ramps': 1 } at the end of the list of all indexes. I'm afraid that makes a difference.
I will try to prove my point by doing (1) and then doing (2) on ms.lmfdb.xyz.
After (1) it went from ~4.5s to ~3s
then I did (2) and it is now at 0.5s
@edgarcosta OK, this is unpleasant. Can you tell me (I'm away from my terminal at the moment): is the single-key index on degree now first among all indexes that start with degree? If not, I'd be curious to see what happens if you destroy and recreate the ones ahead of it.
Now { 'degree': 1, 'ramps': 1 } is at the end of IndexKeys array, see: https://gist.github.com/edgarcosta/2c4a5cd65ed8f71f7f701ad6017da962
Right, and it is now very snappy (400ms on my machine at MIT). I notice that {'degree':1, 'discriminant':1} is ahead of {'degree':1}; can you tell which index it is using?
It is using: {'degree':1, 'discriminant':1} see: https://gist.github.com/edgarcosta/68405f6015e131f8f221122685eca1ae
As you would have predicted. Can you try dropping and re-adding {'degree':1, 'discriminant':1}?
This is a long discussion that probably should not be in the description of an "issue", but I want to put it where everyone can see it -- we can move it elsewhere later. If you are not interested in the performance of the LMFDB you don't need to read it, but I would argue that anyone who cares about the success of the LMFDB should be interested in performance, as it can have a huge impact on user perception. It contains a markup table that is best viewed on github at https://github.com/LMFDB/lmfdb/issues/1247.
As we prepare to go live with Release 1.0 we need to think about how to monitor performance and improve it where needed. I know there are server side tools to help with this, but we need to configure and figure out how to best use them. I also think it is important to monitor the user experience from the client side, because this may turn up issues that are not always evident on the server side.
This morning I ran a modified version of a curl script created by Edgar Costa, using it to measure load times of all the top-level pages of the LMFDB (anything you can get to with one click on the sidebar). I ran it from MIT and measured the time to get the first byte of the response (TTFB). These times are thus essentially lower bounds on the response times seen by users, since the requests were running over a very fast internet connection close to the backbone, and they do not include any time spent by the browser rendering the returned page. I measured times both to the cloud server (www.lmfdb.org) and to Warwick (lmfdb.warwick.ac.uk) over 100 consecutive requests (arguably it would be better to space them out over time, but I did not take the time to do this; I was in a hurry to get the results).
(discussion continued below the table)
Ideally we would like the mean response time for all our pages to be below 400ms -- this is the generally accepted threshold at which users first begin to change their website usage based on perceived performance (Google and Akamai have done a lot of research on this), and anything over 2 seconds represents a point where an unmotivated user may well give up (something like 20-40 percent do). In my initial test, the time for EllipticCurve/Q jumped out as being remarkably slower than the others (which agreed with my own personal experience). After some digging I was eventually able to track down the root cause: a missing index on an attribute that was being used to count isogeny classes of elliptic curves in ec_stats.py. Adding this index dramatically improved performance. I suspect there are several other "easy wins" to be had, and my purpose in raising this issue is to get people to think about them.
We should of course also be thinking about the performance of pages below the top level, and the web server analytics should give us coverage data that will tell us which pages we should focus on (this is the other part of this issue, configuring and using analytics).
Another thing I noticed during these experiments is that in cases where computations are cached (e.g. statistics computed in elliptic_curves, genus2_curves, and elsewhere), the response times have a bimodal distribution -- the first access by a given worker thread is slow, but later accesses that can reuse the cached data are much faster (as much as a 5-to-1 or 10-to-1 difference). You might think this means that once the software has been up and running for a while every worker thread will be using cached data and will respond quickly, but this is clearly not the case. If you run many copies of the page fetch script in parallel for a large number of iterations, it will quickly reach a steady state where all the responses are quick. But if you then stop and do a handful of requests 10-15 minutes later, you have a good chance of seeing slow responses again. I don't know why this is happening (it may have to do with worker threads being dynamically created and killed to adjust to changing load), but it would be good to figure out.
I realize that I have touched on a number of issues that should probably be raised individually, but I wanted to do a brain dump and put it out there so that others can think about these issues and respond.
Here (http://math.mit.edu/~drew/lmfdb_timings.sh) is the curl script I used, in case anyone else wants to run it from their location (it separates out connect time, so you can see how much of the difference is due to the time to reach the server versus the time it spends responding).
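For anyone reproducing the tables above, here is a sketch of how the min/max/ave/dev columns can be computed from a list of per-request times in seconds (the assumption that "dev" is the population standard deviation is mine, not taken from the original script):

```python
import math

def summarize_timings(times):
    """Summarize a list of per-request times (seconds) as min/max/ave/dev."""
    n = len(times)
    ave = sum(times) / n
    # Population standard deviation (divide by n, not n - 1).
    dev = math.sqrt(sum((t - ave) ** 2 for t in times) / n)
    return {"min": min(times), "max": max(times), "ave": ave, "dev": dev}
```

Feeding it the 100 samples per page from the curl script reproduces one row of the table.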