Closed brent-hartwig closed 2 weeks ago
FYI @roamye, @clarkepeterf, @gigamorph
Based on a team conversation today, the decision on whether to implement this ticket is blocked on upcoming discussions on how to perform a performance test. cc: @jffcamp, @prowns, @roamye
@brent-hartwig - is this blocked by the discussion or by something else? This also needs to go to prioritization review, as it skipped over the UAT process.
@roamye, it is blocked pending a discussion / decision. My vote is to just close this ticket. Only after we figure out how the performance test can be executed to better represent production (e.g., more varied data with the middle tier cache enabled) will we be able to identify and prioritize bottlenecks. Those bottlenecks may or may not be removed by changing how the middle tier utilizes the two backend app servers. Further, due to facet request pagination, the ML 11.3 improvements, and the fact that the backend is never pushed as hard in production as it is during a performance test, we may be able to go back to one app server, which would simplify the environment a little.
cc: @clarkepeterf, @jffcamp, @prowns
UAT 8/26: This will be brought up in the performance test discussion this Wednesday. Will determine if this is closed/opened then.
@prowns to add to agenda.
@prowns -
I looked over the performance test discussion and this was not part of the agenda. It is unclear whether this ticket should remain open or closed. Should this be brought up in the IT Team Meeting to discuss?
I propose closing this ticket based on our intent to revamp the performance test and only then seek out bottlenecks. We can reopen this ticket if it turns out to be one of the bottlenecks.
I concur with @brent-hartwig.
As do I
Approved by UAT 9/12 to close.
Problem Description: During performance tests, one of the two application server request queues can be full for a sustained period, which likely explains the thousands of 504 responses per second from the MarkLogic load balancer, which we believe the data service proxies are retrying. We believe the data service proxies are mostly successful since the web cache load balancer peaked around 100 per second. If we redirect some requests from the application server whose queue fills up to the one whose queue doesn't, the stack may be able to service more requests in a shorter period of time and have fewer 504s to process.
During 7 Jun 24's performance test, one queue was full 82% of the time the metric was recorded:
Expected Behavior/Solution: Increase traffic to the apparently underutilized MarkLogic application server. How best to do this is yet to be determined. Ideas and considerations thought of thus far (not intended to be mutually-exclusive):
Note we have tried nos. 3 and 4 before, specifically scenarios K, M, N, and P. We did not end up selecting any of those configurations; however, much has changed since then.
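For discussion only, here is a minimal sketch of one way requests could spill over from a saturated application server to the other one. It is not necessarily among the ideas listed in this ticket, and it is not how the middle tier works today. It assumes the middle tier could track its own in-flight requests per application server; `AppServer`, `try_acquire`, `release`, and `pick_server` are hypothetical names. With more than one middle tier instance, each instance would only see its own counts, so the limits here would be approximate at best.

```python
import threading


class AppServer:
    """Hypothetical tracker of in-flight requests for one backend app server."""

    def __init__(self, name, port, max_concurrent):
        self.name = name
        self.port = port
        self.max_concurrent = max_concurrent
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Reserve a slot if one is free; return True on success."""
        with self._lock:
            if self._in_flight < self.max_concurrent:
                self._in_flight += 1
                return True
            return False

    def release(self):
        """Give back a slot once the backend request has completed."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1


def pick_server(preferred, fallback):
    """Return the server that accepted the request, trying preferred first.

    Returns None when both servers are at their limit, in which case the
    caller would have to wait, queue, or retry.
    """
    if preferred.try_acquire():
        return preferred
    if fallback.try_acquire():
        return fallback
    return None
```

A caller would call `pick_server(...)` before sending a backend request and `release()` on the returned server afterwards (for example in a `finally` block). Whether spilling `search` traffic onto the other application server helps or hurts would still need to be confirmed by a performance test.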
Current configuration (Scenario J):
- `lux-request-group-1` on port 8003: The middle tier is expected to send all requests here except `search` and `relatedList` requests. Maximum of 6 concurrent requests.
- `lux-request-group-2` on port 8004: The middle tier is expected to send all `search` and `relatedList` requests to this application server. Maximum of 12 concurrent requests.

Breakdown of requests by type and duration from the 7 Jun 24 performance test (https://github.com/project-lux/lux-marklogic/issues/162):
Note there were fewer than expected backend requests during the 7 Jun 24 performance test, and we may not yet know why.
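As a rough illustration of the Scenario J routing rule described above, the selection logic can be sketched as follows. The group names, ports, and concurrency limits come from this ticket; `RequestGroup`, `select_group`, and the sample request type `"facets"` are hypothetical and are not taken from the middle tier code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestGroup:
    name: str
    port: int
    max_concurrent: int  # concurrency limit stated in this ticket


GROUP_1 = RequestGroup("lux-request-group-1", 8003, 6)
GROUP_2 = RequestGroup("lux-request-group-2", 8004, 12)

# Request types that Scenario J sends to the second application server.
GROUP_2_REQUEST_TYPES = frozenset({"search", "relatedList"})


def select_group(request_type):
    """Return the request group a request type is expected to reach."""
    if request_type in GROUP_2_REQUEST_TYPES:
        return GROUP_2
    return GROUP_1


# "facets" stands in for any request type other than search and relatedList.
for request_type in ("search", "relatedList", "facets"):
    group = select_group(request_type)
    print(f"{request_type} -> {group.name} on port {group.port}")
```

A table like this could also serve as the expected-results side of an endpoint-by-endpoint check that each request reaches its intended application server.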
Requirements: See above.
Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.
~- [ ] Wireframe/Mockup - Mike~
UAT/LUX Examples: All endpoints should be tested to ensure they reach the intended backend application server. Performance test recommended as well.
Dependencies/Blocks: This issue is neither dependent on nor blocking another.
Related Github Issues: https://github.com/project-lux/lux-marklogic/issues/162
Related links: None.
Wireframe/Mockup: Not needed.