SDKits / ExamineX

Issue tracker for ExamineX
https://examinex.online

Performance issues when upgrading to 2.0.10 #80

Closed daniel-eriksson closed 4 months ago

daniel-eriksson commented 1 year ago

We upgraded from version 1.3.2 to 2.0.10 and are experiencing slow server response times. Reverting to v1 seems to improve things a bit, but then we cannot rebuild our indexes with the older version. Would it be possible to add the "data source" fix in a minor release of v1?

Shazwazza commented 1 year ago

Hi, can you elaborate on what slow server response times means? For example, what part of ExamineX do you believe to be slower? (searching, indexing, etc...).

There aren't a lot of changes in logic between these 2 versions, only major version upgrades to some underlying dependencies such as the Azure Search library.

I'll look into releasing a patch release for v1 but since that is using a legacy/obsolete version of Azure Search itself, it is recommended to not keep using that version.

daniel-eriksson commented 1 year ago

Hi, sure. Indexing feels pretty fast, querying also looks pretty fast when looking at the request in Application Insights but for some reason the server response time is slow.

In our previous environment running Umbraco 7 we had no such problems, but then we had all the other problems related to Lucene/Examine in a load-balanced environment. So when we updated to Umbraco 8 a couple of months ago and started using ExamineX (v1.3.2) all that was resolved, but instead we had really slow responses from querying our index. We resolved that by setting a 5-minute cache on these queries, which had a big impact on our loading times. But now, after upgrading to 2.0.10, the cache doesn't seem to help any longer and it even seems to affect the server response time on pages not using the index? I rolled back to v1.3.2 yesterday and the response times got better again (still not good, though).

When comparing a request to a page that uses the index in 1.3.2 versus 2.0.10, the newer version seems to make duplicate requests, both to "GetIndex" and to "POST /indexes/*". We also saw some 503 exceptions when using the 2.0.10 version, but these I think we resolved by caching the "searcher".

Shazwazza commented 1 year ago

Hi, thanks for the info. Which server response times are slow? From the browser to Umbraco, or from Umbraco to Azure Search?

As mentioned there aren't very many logical changes between these two versions so I'm unsure why there would be a performance change.

> it even seems to affect the server response time on pages not using the index?

If that is the case, ExamineX isn't doing anything to impact any performance.

> but these I think we resolved by caching the "searcher".

What do you mean by this? Caching the searcher object will have no effect, it is already a singleton.

My advice would be to run a profiler so you can be sure where the bottleneck actually is. Without that information, I'm not sure where to go from here, especially if pages that aren't even using ExamineX are slower too. Happy to help where possible, but I think a profiler is what is needed here.

daniel-eriksson commented 1 year ago

Hi, by server response time I basically mean the time before the page starts rendering, the time displayed in Chrome web inspector for the current url. But also the average response time in Application Insights for the urls using Examine.

OK, I guess we could remove the caching if it’s already a singleton. This was a quick fix to see whether it would stop the indexer being requested multiple times per page request.

Yeah, I need to do some more debugging/profiling to see what’s really going on here. 

I noticed we have the Azure Search Location set up in “Germany West Central” but all other services in “West Europe”. I think this was because that location was not selectable when setting up the Azure Search resource for some reason. I’ve read mixed locations can be negative for performance, do you think this might have an impact?

Anyhow, the site runs smoother with the 1.3.2 version, so we would highly appreciate a patch release for v1 to use while investigating. As you can see in the attached screenshot from Application Insights, our average response times went crazy after we released v2.0.10 on 28/4.

[Screenshot: response-times-last-30-days]

Shazwazza commented 1 year ago

I've published a 1.3.3 version to NuGet with the data source fix in it.

Let me know when you get more debugging information or if you get further along in your investigation.

daniel-eriksson commented 1 year ago

Hi, and thank you for the patch!

I'm not very experienced in profiling server web performance, but perhaps you can identify something obvious here. I conducted the profiling using the new 1.3.3 version because I believe the problem exists in this version as well, although it's not as pronounced. So, you're probably right that it's not really a problem with version 2, but rather something within our application.

The screenshots from DotTrace seem to indicate that Azure Search is slow for some reason. I've also attached a screenshot showing some exceptions that seem to be occurring. Could it be that there is an incorrectly formatted query on our side, which is throwing an exception in Azure Search and causing the slow response? Alternatively, do you have any other ideas about what might be causing this?

[Screenshot: AzSearch]

[Screenshot: profiling-examinex-v1-3-3]

[Screenshot: exceptions]

Shazwazza commented 1 year ago

Hi, right, I think I might know what could be occurring. ExamineX has a retry policy for calls to the Azure Search APIs, in accordance with the Azure Search docs. However, in some cases, if filtering during indexing leaves the index operation empty, it results in an error and ExamineX then retries, even though the operation is empty. A similar thing could occur during a search if an invalid search is generated. ExamineX also shouldn't search twice when you get the search results and then resolve the total item count from the same result, but based on your screenshot there's some overhead with the total item count too (could be the same thing I mentioned above). Do you actively resolve both the total item count and the results? Do you do this in different searches or use the same search result instance?
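
To make the failure mode above concrete, here is a minimal, hypothetical sketch of a retry wrapper that skips empty batches and only retries errors classed as transient. It is not ExamineX's actual code and does not use the Azure Search SDK; `RetrySketch`, `SendWithRetry`, `sendBatch` and `isTransient` are all names invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: none of these names come from ExamineX or the Azure Search SDK.
public static class RetrySketch
{
    // Sends a batch with retries and returns the number of attempts made.
    // An empty batch is never sent, so it can never fail and trigger retries.
    public static int SendWithRetry<T>(
        IReadOnlyCollection<T> batch,
        Action<IReadOnlyCollection<T>> sendBatch,
        Func<Exception, bool> isTransient,
        int maxAttempts = 3,
        int baseDelayMs = 200)
    {
        if (batch.Count == 0)
        {
            return 0; // nothing to index: skip the call entirely
        }

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                sendBatch(batch);
                return attempt;
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Exponential backoff: baseDelayMs, 2x, 4x, ...
                Thread.Sleep(baseDelayMs * (1 << (attempt - 1)));
            }
            // Non-transient errors (for example an invalid query), or the
            // final failed attempt, propagate to the caller without a retry.
        }
    }
}
```

With a wrapper like this, an invalid request fails once and surfaces immediately instead of adding several backoff delays to the page response time.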

daniel-eriksson commented 1 year ago

Hi. Yeah, the retry policy does sound like a potential cause of the slow responses. To identify the issue, I need to investigate what is wrong with our queries that triggers the retries. The queries do return valid results, though, so debugging might be a bit challenging. Any idea how I would identify the error in my queries?

I’m resolving the Total Count using the same searcher instance. Here's an example:

    var result = criteria.OrderByDescending(orders.ToArray()).Execute(maxResult);
    var total = result.TotalItemCount;

In the code above, criteria is of type IBooleanOperation, and orders is a List<SortableField>.
 
FYI: On some of our pages we do two searches and then merge the result to be able to sort the two result sets differently. This is probably the reason why it looks this way in my screenshots.

Shazwazza commented 4 months ago

Hi @daniel-eriksson, there have been several new versions of ExamineX shipped, some of them addressing retry issues. Since I haven't heard back, I'll go ahead and close this issue. Also, regarding performance, please be sure to read this from Microsoft regarding field counts: https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity#index-limits

> 2 The upper limit on fields includes both first-level fields and nested subfields in a complex collection. For example, if an index contains 15 fields and has two complex collections with five subfields each, the field count of your index is 25. Indexes with a very large fields collection can be slow. Limit fields and attributes to just those you need, and run indexing and query test to ensure performance is acceptable.
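
The arithmetic in the quoted note can be sketched as below. This is only an illustration of the counting rule as the note states it; `FieldCountSketch` and `CountFields` are hypothetical names, not part of any Azure or ExamineX API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the counting rule from the quoted note.
public static class FieldCountSketch
{
    // Field count = first-level fields + subfields of every complex collection.
    public static int CountFields(int firstLevelFields, IEnumerable<int> subfieldsPerComplexCollection)
    {
        return firstLevelFields + subfieldsPerComplexCollection.Sum();
    }
}
```

For the example in the note, `FieldCountSketch.CountFields(15, new[] { 5, 5 })` gives 15 + 5 + 5 = 25.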

You can filter out fields that go into indexes using the TransformingIndexValue event; an example is here: https://shazwazza.com/post/filtering-fields-dynamically-with-examine/
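
The core of such a filter is an allow-list applied to the field values before they are indexed. The sketch below shows only that step on a plain dictionary. It deliberately does not reproduce the Examine event signature (see the linked post for that), and `FieldFilterSketch` and `KeepAllowedFields` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: plain dictionaries stand in for the values passed to
// the indexing event; this is not the Examine API.
public static class FieldFilterSketch
{
    // Returns a copy of the values containing only fields on the allow-list.
    public static Dictionary<string, object> KeepAllowedFields(
        IReadOnlyDictionary<string, object> values,
        ISet<string> allowedFields)
    {
        return values
            .Where(kv => allowedFields.Contains(kv.Key))
            .ToDictionary(kv => kv.Key, kv => kv.Value);
    }
}
```

A handler would call something like `KeepAllowedFields(values, new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "nodeName", "bodyText" })` and index only the returned fields, which keeps the index's field count down. The field names here are made up for the example.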