Open Piedone opened 4 years ago
Might be a config option to preload instead of doing it by default. We'd also need to lock on the resolution for a specific url (gating) to prevent bursts of SQL queries on startup, for the homepage for instance. It might also be an MRU collection in the case of lazy loading so we don't store routes that are rarely used.
Would you mind creating a site with thousands of autoroutes and measuring startup? I assume the simplest way is with a SQL query, with all autoroutes pointing to the same action.
Also, `AutorouteEntries` is not thread-safe.
I agree, it has been like this from the beginning. I refactored it in my distributedCache branch, where we don't resolve autoroutes on startup but on first need (e.g. not on the homepage), but still in one go when we first create the related cache, a volatile (only cached) distributed document.
Hmm, if we populate the document lazily and there is no entry matching a given request path, how could we know whether the entry doesn't exist or is just not yet populated, without always doing a query?
> We'd also need to lock on the resolution for a specific url (gating) to prevent bursts of SQL queries
How can we do this if, on the other side, we don't want to use custom locking systems? E.g. as in my old distributed branch where there is an `ILock` service providing tenant-level named locks, with a distributed implementation based on Redis.
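As a sketch of the gating idea without a custom lock service (all names below are hypothetical, not actual Orchard Core APIs): a `Lazy<Task<T>>` stored per path in a `ConcurrentDictionary` ensures only one SQL query runs per URL even under a burst of concurrent requests.

```csharp
// Hypothetical sketch of per-URL gating: only the first concurrent request
// for a given path runs the SQL query; all others await the same task.
// GatedAutorouteResolver and this AutorouteEntry shape are illustrative only.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public record AutorouteEntry(string ContentItemId, string Path);

public class GatedAutorouteResolver
{
    private readonly ConcurrentDictionary<string, Lazy<Task<AutorouteEntry>>> _entries = new();

    public Task<AutorouteEntry> GetEntryAsync(
        string path, Func<string, Task<AutorouteEntry>> queryDatabase)
    {
        // GetOrAdd may race and build several Lazy wrappers, but only the
        // stored one ever has its Value awaited, so queryDatabase runs at
        // most once per path.
        return _entries.GetOrAdd(
            path,
            p => new Lazy<Task<AutorouteEntry>>(() => queryDatabase(p))).Value;
    }
}
```

This keeps everything in-process; a distributed setup would still need a shared lock or accept one query per node.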
> It might also be an MRU collection in the case of lazy loading so we don't store routes that are rarely used.
Good idea
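For illustration, a size-capped MRU cache along these lines could look like the following sketch (hypothetical names, not Orchard Core code; note it is not thread-safe, so a real implementation would need locking or a concurrent design):

```csharp
// Hypothetical sketch of an MRU/LRU cache for autoroute entries: once
// capacity is exceeded, the least recently used path is evicted, so
// rarely used routes don't accumulate in memory.
using System.Collections.Generic;

public class MruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value)>> _map = new();
    private readonly LinkedList<(TKey Key, TValue Value)> _order = new();

    public MruCache(int capacity) => _capacity = capacity;

    public bool TryGet(TKey key, out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // Move the entry to the front to mark it most recently used.
            _order.Remove(node);
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default;
        return false;
    }

    public void Set(TKey key, TValue value)
    {
        if (_map.TryGetValue(key, out var existing))
        {
            _order.Remove(existing);
            _map.Remove(key);
        }
        var node = new LinkedListNode<(TKey, TValue)>((key, value));
        _order.AddFirst(node);
        _map[key] = node;

        if (_map.Count > _capacity)
        {
            // Evict the least recently used entry from the tail.
            var lru = _order.Last;
            _order.RemoveLast();
            _map.Remove(lru.Value.Key);
        }
    }
}
```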
I've tested the impact with this query on SQLite:
```sql
-- Replace `count` with the desired number of rows.
WITH RECURSIVE
  for(i) AS (VALUES(1) UNION ALL SELECT i+1 FROM for WHERE i < count)
INSERT INTO AutoroutePartIndex (DocumentId, ContentItemId, Path, Published, Latest)
SELECT i, '4xkqkmeg8zx2y5r9jxensthvmx', 'home-page', 1, 1 FROM for
```
And this one in a local SQL Server instance:
```sql
-- Replace `count` with the desired number of rows.
DECLARE @i int = 0
WHILE @i < count
BEGIN
    SET @i = @i + 1
    INSERT INTO AutoroutePartIndex (DocumentId, ContentItemId, Path, Published, Latest)
    VALUES (26, '4xkqkmeg8zx2y5r9jxensthvmx', 'home-page', 1, 1);
END
```
Results for SQLite, rough averages:
For SQL Server:
I don't know how these would change if actual content items were present. Perhaps there would be some difference due to the foreign keys all pointing to different documents.
So if you have a few items or a local DB, this is fast. I'd imagine with a remote DB these numbers go up significantly.
> Hmm, if we populate the document lazily, if there is no entry matching a given request path, how could we know that the entry doesn't exist or is not yet populated, without always doing a query?
We'd cache negative results too. So the first lookup request for a path comes in and we look it up in SQL. Found? Put the `AutorouteEntry` into the cache. Not found? Put an object representing "not found" into the cache (maybe a private subclass of `AutorouteEntry` just for this purpose?). On the next request we'll read the same result from the cache. If the content changes, the cache will be updated, but this happens already.
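A minimal sketch of this negative-caching idea could look like the following (hypothetical names and query delegate, not actual Orchard Core code):

```csharp
// Hypothetical sketch of negative caching: a private sentinel subclass
// represents "we already looked, there is no route for this path", so
// repeated misses never hit SQL again.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class AutorouteEntry
{
    public string ContentItemId { get; set; }
    public string Path { get; set; }

    // Sentinel instance standing in for a confirmed miss.
    private sealed class NotFoundEntry : AutorouteEntry { }
    public static readonly AutorouteEntry NotFound = new NotFoundEntry();
}

public class CachingAliasProvider
{
    private readonly ConcurrentDictionary<string, AutorouteEntry> _cache = new();

    public async Task<AutorouteEntry> GetAsync(
        string path, Func<string, Task<AutorouteEntry>> queryDatabase)
    {
        if (!_cache.TryGetValue(path, out var entry))
        {
            // Store hits and misses alike; a miss becomes the sentinel,
            // so the next request for this path skips the query entirely.
            entry = await queryDatabase(path) ?? AutorouteEntry.NotFound;
            _cache[path] = entry;
        }

        return entry == AutorouteEntry.NotFound ? null : entry;
    }
}
```

When a content item changes, the corresponding cache entry (positive or negative) would be invalidated, as the cache update already does today.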
Your numbers seem to disprove the issue.
However, there might be cases where we want one or the other. Not sure it's worth implementing a different logic right now, as even with 1M items the startup is totally reasonable, and adding lazy loading would mean extra complexity.
I am not too worried about these issues because there are other ways in the system to force queries or allocate things, and this should not take the app down, just slow it down.
Maybe we could at least abstract this part if someone needs to implement some custom logic.
I think it's great to have some numbers for specific loads. Next we need to do the same with 1M content items ;) And I assume that the way the index tables are used is really paying off.
> We'd cache negative results too. So first lookup request for a path comes, we look it up in SQL. Found? If yes, put the AutorouteEntry into the cache. Not found? Put an object representing "not found" into the cache (maybe a private subclass of AutorouteEntry just for this purpose?). On the next request we'll read the same result from the cache. If the content changes then the cache will be updated, but this happens already.
Good idea, but the problem is that we map on the following, so the autoroute transformer is executed e.g. for all static assets, and we would end up caching many negative results:

```csharp
routes.MapDynamicControllerRoute<AutoRouteTransformer>("/{any}/{**slug}");
```
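One possible mitigation (a heuristic sketch, not a proposal from the thread): since the transformer route matches everything, skip negative caching for requests that look like static assets, i.e. paths with a file extension. Real static files are usually short-circuited earlier in the pipeline by the static file middleware anyway.

```csharp
// Hypothetical heuristic: don't add "/site.css" or "/favicon.ico" style
// paths to the negative cache, only extensionless content-like paths.
using System.IO;

public static class NegativeCachePolicy
{
    // "/blog/my-post" -> cache the miss; "/favicon.ico" -> don't.
    public static bool ShouldCacheNegativeResult(string path) =>
        !Path.HasExtension(path);
}
```

Combined with an MRU cap on the cache size, this would keep asset noise from crowding out real route entries.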
I was curious whether this changes significantly in an actual production-like environment with a remote DB, so I've made some tests on Azure: 50 DTU (S2) standard tier SQL DB, B1 Web App.
So there seems to be a soft limit at around 4M records (note that there are no corresponding content items, so the actual number is likely lower) in realistic scenarios without further tweaking.
However, there are some larger first-request times (not sure what to call them): I didn't want to include the first startup after adding new rows because I thought that's unrealistic for a production app (I only added it for the last three measurements when I noticed it might mean something). However, I'm not actually sure what the difference is between this first request and the subsequent ones. I'd guess SQL Server does some caching, but I don't know when that cache is invalidated. So as a worst-case scenario I also included these numbers in parentheses: maybe this is the number you'd actually see in production, maybe the other one. Interestingly, for 1M rows there was no difference (perhaps it's too large for a cache).
Note that these numbers (like yesterday's) are with zero load apart from the measurements. I also tried tenant restarts while the DB was around 750% DTU and got 30-60s with around 3M records.
Anyway, I stand corrected: contrary to how it looks, this won't be an issue for quite a while. While technically the current implementation won't scale, it's still good enough even for a large site, unless it runs on significantly worse hardware/network than what I've used in Azure.
I don't have a good idea for your remark @jtkech .
Cool, interesting to have some actual performance figures.
For info, it is trivial to use some of the API tests to create and connect to a SQL database, and create content items ad nauseam.

The integration tests currently use SQLite to create multiple tenant databases; you can then take such a database from the Tests/bin folder, move it to App_Data, and load it as an actual tenant.

Or connect the tests directly to a local or remote SQL Server and create multiple tenants with a million content items.

I did some of this when testing recipes, and could push some testing code to a branch if you are interested in a more real-life test.
It would certainly be interesting to see. Let me know if you have the time to test them, and I will push a branch.
Here's a link to one of the existing tests https://github.com/OrchardCMS/OrchardCore/blob/dev/test/OrchardCore.Tests/Apis/ContentManagement/DeploymentPlans/ContentStepLuceneQueryTests.cs
In terms of the actual numbers you have already tested, a couple of thoughts.
But those startup times are still pretty reasonable.
I would be interested to see how startup performs when it is receiving 500+ requests/minute during a site recycle. This is an area that has caused us issues with ASP.NET Core, but not something I can test with Orchard Core.
I don't have more time to play with this at the moment but thanks for the info.
Should we close this issue?
Let's keep it open. It is an issue, just not an immediate one and not for everybody.
BTW I'm less concerned about the time a startup takes and more about potential timeouts. Depending on your configuration this can reach a level where the tenant is permanently stuck in a startup timeout, failing to start up at all. On Azure the DB connection timeout is 30s (which I'd argue is actually too high for a web app; you really want unexpectedly slow queries to fail faster), which is around the measurement at the 4M threshold. Eventually somebody will bump into this.
The `Startup` class of Autoroute fetches all the indices in a single query on shell start. While this is no problem for small sites, it'll become an issue with hundreds (and especially tens of thousands and more) of content items using `AutoroutePart`: all those items' index records will be retrieved in one go. This can be reached even with a blog where you post periodically, and especially with something like a news site (or just a large site with many content pages). This won't scale.
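The eager load is roughly equivalent to something like this sketch (assuming YesSql's `QueryIndex` API; field names mirror the `AutoroutePartIndex` columns, but treat the snippet as illustrative, not the actual module code):

```csharp
// Sketch of the eager startup load: one query returning every published
// AutoroutePartIndex row, all materialized into memory at once.
var indexes = await _session
    .QueryIndex<AutoroutePartIndex>(i => i.Published)
    .ListAsync();

foreach (var index in indexes)
{
    // With millions of rows, this query and loop are what make shell
    // startup slow (or time out against a remote database).
    _entries[index.Path] = index.ContentItemId;
}
```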
What we can do instead, e.g., is to make `AutorouteAliasProvider` look up aliases lazily but cache the results (even in a `ConcurrentDictionary`, but more so in a proper, even distributed cache like the one @jtkech is working on). This way, while each lookup would first incur the performance penalty of an SQL query, it would support an unlimited number of content items (available memory for caching permitting, but the cache can manage that), have no impact on startup performance (nor eventually cause DB timeouts), and still provide the same performance for repeated lookups (which should cover all the popular content items).

On a related note, I think the current implementation will break in a multi-server setup too, if multiple nodes are creating content items.
Also, `AutorouteEntries` is not thread-safe: it uses a non-concurrent `Dictionary` for caching, and while it synchronizes writes to it, it doesn't do the same for reads (concurrent reads are only fine while the dictionary is not modified at the same time).
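One minimal way to fix this (a sketch, not the actual `AutorouteEntries` code) is either to switch to a `ConcurrentDictionary`, or to keep an immutable dictionary and atomically swap the whole reference on writes, so readers always see a consistent snapshot:

```csharp
// Sketch: copy-on-write with an immutable dictionary. Readers never lock;
// writers rebuild from the current snapshot and publish it atomically.
using System.Collections.Immutable;
using System.Threading;

public class ThreadSafeEntries
{
    private ImmutableDictionary<string, string> _paths =
        ImmutableDictionary<string, string>.Empty;

    public bool TryGetContentItemId(string path, out string contentItemId) =>
        _paths.TryGetValue(path, out contentItemId);

    public void AddEntry(string path, string contentItemId)
    {
        // Lock-free update loop: retry if another writer swapped the
        // reference between our read and our compare-and-swap.
        ImmutableDictionary<string, string> current, updated;
        do
        {
            current = _paths;
            updated = current.SetItem(path, contentItemId);
        }
        while (Interlocked.CompareExchange(ref _paths, updated, current) != current);
    }
}
```

This trades slightly more expensive writes for completely unsynchronized reads, which fits the route-resolution workload (reads vastly outnumber writes).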