Anthony-Nolan / Atlas

A free & open-source Donor Search Algorithm Service
GNU General Public License v3.0

Profile Atlas in Azure #1208

Closed zabeen closed 5 months ago

zabeen commented 7 months ago

https://learn.microsoft.com/en-us/azure/azure-monitor/profiler/profiler-overview

See notes from previous attempt: https://github.com/Anthony-Nolan/Atlas/issues/1119

The previous attempt to profile Atlas failed because the Application Insights Profiler can only run on an App Service plan, not on any of the Functions plans.

1) Create a new spike branch from git tag stable/1.6.1, which is the current stable Atlas version.
2) On the spike branch, make the necessary Terraform script changes to convert the elastic plans to App Service plans.
3) Enable the profiler on both the matching and <env>-atlas-functions apps, as described in #1119.
4) Deploy the spike branch to dev-atlas and check that the plans have updated as required.
5) Run test searches with match prediction ON, and see if the profiler captures data.

If the test is successful, we can discuss with WMDA about deploying the spike to UAT-WMDA-ATLAS and running performance tests there, as the data there is similar to live-atlas-wmda.

SergeyEzh commented 6 months ago

@zabeen, I was able to apply the Terraform configuration for profiling successfully. The new functions were deployed on an App Service plan and the regular ones were disabled. I manually started a profiling session and got some results:

Image

It also showed some recommendations:

Image

Don't be confused: 'dev-atlas-matching-algorithm-functio...' is 'dev-atlas-matching-algorithm-functions-temp'. These are not recommendations from the time before profiling. In the first deployment, the profiled functions had 'temp' at the end of their names. However, this caused an ID conflict: the ID is generated by truncating the name, so with 'temp' only at the end, the IDs of the new and original apps were identical.
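The truncation conflict described above can be reproduced with a small sketch. It assumes the conflicting ID is simply the first 32 characters of the app name, which is how the Azure Functions host ID is documented to be derived by default; the comment does not say which ID was involved, so treat the length and the helper names (`derived_host_id`, `find_collisions`) as illustrative assumptions.

```python
# Assumption: the ID is the first 32 characters of the app name, lower-cased.
# The thread does not confirm which ID collided; 32 is the documented
# default truncation length for the Azure Functions host ID.
HOST_ID_MAX_LENGTH = 32


def derived_host_id(app_name: str, max_length: int = HOST_ID_MAX_LENGTH) -> str:
    """Return the truncated, lower-cased ID for an app name."""
    return app_name[:max_length].lower()


def find_collisions(app_names):
    """Group app names by derived ID and keep only IDs shared by 2+ apps."""
    by_id = {}
    for name in app_names:
        by_id.setdefault(derived_host_id(name), []).append(name)
    return {host_id: names for host_id, names in by_id.items() if len(names) > 1}


original = "dev-atlas-matching-algorithm-functions"
suffixed = original + "-temp"

# Both names are longer than 32 characters, so the '-temp' suffix is cut off
# and the two apps end up with the same derived ID.
collisions = find_collisions([original, suffixed])
print(collisions)
```

Because the suffix falls entirely beyond the truncation point, any distinguishing marker has to sit within the first 32 characters of the name for the derived IDs to differ under this assumption.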

However, a couple of things confuse me:

Image

Most of the time, the code that was running is system or Azure code. I suppose this is because the profiler sessions were relatively long (7-20 mins), I was only sending search requests manually from time to time, and the profiler doesn't exclude all service code from profiling. I think profiler sessions need to be run under some load to get adequate results.
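One way to keep the apps busy during a profiler session is a small script that sends search requests continuously instead of by hand. The following is only a sketch using the Python standard library: the endpoint URL is a placeholder, the payload shape is not taken from Atlas, and `post_search` and `run_load` are hypothetical helper names.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholder; replace with the real search endpoint and auth.
SEARCH_URL = "https://example.invalid/api/Search"


def post_search(payload: dict, url: str = SEARCH_URL, timeout: float = 30.0) -> int:
    """POST one search request as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run_load(send, payloads, concurrency=4, interval_seconds=0.0):
    """Submit every payload through `send` on a thread pool.

    Returns one entry per payload, in submission order: the value returned
    by `send`, or the exception it raised.
    """
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = []
        for payload in payloads:
            futures.append(pool.submit(send, payload))
            if interval_seconds:
                time.sleep(interval_seconds)  # pace submissions
        for future in futures:
            try:
                results.append(future.result())
            except Exception as error:  # keep going so one failure does not end the run
                results.append(error)
    return results
```

For example, `run_load(post_search, [payload] * 200, concurrency=8, interval_seconds=0.5)` would submit 200 requests over at least 100 seconds, where `payload` is whatever search request body the environment expects. Passing the sender in as a parameter keeps the pacing logic testable without touching the network.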

The results are still there in Azure; you can look at them.