Geta / geta-notfoundhandler

The popular NotFound handler for ASP.NET Core and Optimizely, enabling better control over your 404 page in addition to allowing redirects for old URLs that no longer works.
Apache License 2.0
19 stars 15 forks source link

CustomRedirectCollection hogs too much memory resources #92

Open giangnb opened 1 year ago

giangnb commented 1 year ago

The CustomRedirectCollection contains all configured redirects and takes up a great amount of memory resources if there are lots of rules. In some extreme case, it causes site down.ZACache image

All redirects are stored in CustomRedirectCollection._quickLookupTable. And the operation to populate _redirectsZACache worsen the problem: https://github.com/Geta/geta-notfoundhandler/blob/ca1f3730af6cdbdd797cc2c0d541143216e1acc5/src/Geta.NotFoundHandler/Core/Redirects/CustomRedirectCollection.cs#L104-L109

marisks commented 1 year ago

Looking at the screenshot and it takes 73 Mb which is not much. How many redirects do you have? What error do you get when your site crashes?

giangnb commented 1 year ago

The problem, you can see the mentioned Dictionary objects are in singleton IRedirectHandler. Note that, Dictionary are not thread-safe and whenever a CMS editor adds something to the _quickLookupTable, it would cause a deadlock as there are multiple threads are trying to read and write to the same object. That's the root issue I'm trying to say.

marisks commented 1 year ago

Do you really have deadlocks or is it just a guess?

giangnb commented 1 year ago

That's a suspect as many sites are suffering from the exact same pattern. Where memory is high and request pool is stalled. Have you tested this edge case, when someone changes the redirect rules while there're still lots of other requests at other sessions?

marisks commented 1 year ago

No, I haven't tested such a case.

But I am trying to understand your issue. Do you have a memory dump where you see that the NotFound handler takes a lot of memory (not just 70 MB)? Do you have any proof that there is a deadlock in NotFound handler? Are all sites that suffer from it based on the same codebase? Could it be something else and not NotFound handler?

From what I see in the code - not using concurrency stuff, but just using a Dictionary may cause data corruption in that dictionary. Maybe also cause random modified collection exceptions. But I do not see how it might cause deadlocks as there are no locks. Maybe I am wrong. This code that handles 404 requests in a Dictionary is in this project for 8-10 years and we haven't heard about such issues before and haven't experienced them ourselves. There might be an issue, but we need more data - real data, not just guesses to understand what is happening.