CirroePlatform / Hermes

A search engine for datasets. Built using MongoDB Atlas Vector Search, OpenAI embeddings, and the Hugging Face API.

Seeing application slowdown in response times during peak hours #25

Closed AbhigyaWangoo closed 3 hours ago

AbhigyaWangoo commented 3 hours ago

During peak hours, our latencies spike to 2-3x their normal level. We first noticed this issue around October 20th, and it has happened consistently since then. Can someone help?

cirroe-bot commented 3 hours ago

Checked the AWS CloudWatch EC2 metric CPUUtilization and verified that we were suffering from CPU overload. Checking for any relevant deployments...
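
For anyone who wants to repeat this check by hand, here is a minimal sketch in Python. It assumes boto3 is installed and AWS credentials are configured. The function names, the instance ID argument, the hourly period, and the 80% threshold are all illustrative assumptions, not values taken from this investigation.

```python
from datetime import datetime, timedelta, timezone


def fetch_cpu_datapoints(instance_id, hours=24, region_name=None):
    """Fetch hourly average CPUUtilization for one EC2 instance.

    instance_id is a placeholder; supply the ID of the affected host.
    """
    import boto3  # imported lazily so the helper below works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region_name)
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=3600,  # one datapoint per hour
        Statistics=["Average"],
    )
    return response["Datapoints"]


def overloaded_points(datapoints, threshold=80.0):
    """Return datapoints whose average CPU exceeds threshold, oldest first.

    The 80% default is an assumed cutoff, not a value from this thread.
    """
    hot = [p for p in datapoints if p["Average"] > threshold]
    return sorted(hot, key=lambda p: p["Timestamp"])
```

Calling overloaded_points(fetch_cpu_datapoints(...)) would list the hours to compare against the reported peak-hour latency spikes.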

cirroe-bot commented 3 hours ago

Looked at all deployments on October 20th and found one that introduced potentially compute-intensive code: https://github.com/CirroePlatform/Hermes/commit/334023fa55d83d558688816c82f2eaf4eb26382e
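
A rough way to reproduce this lookup locally is sketched below. It assumes each deployment corresponds to a commit on the deployed branch, which this thread does not confirm. The helper names and the since/until arguments are hypothetical; only the git log options themselves are standard.

```python
import subprocess

# Full SHA, ISO 8601 committer date, and subject, separated by tabs.
LOG_FORMAT = "%H%x09%cI%x09%s"


def parse_log(output):
    """Turn tab-separated `git log` output into a list of commit dicts."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split at most twice so tabs inside the subject are preserved.
        sha, date, subject = line.split("\t", 2)
        commits.append({"sha": sha, "date": date, "subject": subject})
    return commits


def commits_between(since, until, repo_dir="."):
    """List commits whose committer date falls between since and until.

    since and until are date strings that git understands; the caller
    supplies them, since the thread does not state the year.
    """
    result = subprocess.run(
        ["git", "log", f"--since={since}", f"--until={until}",
         f"--pretty=format:{LOG_FORMAT}"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return parse_log(result.stdout)
```

Each returned commit can then be reviewed by hand for compute-heavy changes.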

cirroe-bot commented 3 hours ago

Engaging on-call to roll back. The rollback command should be git reset --hard HEAD~1 (this assumes the commit linked above is the current HEAD of the deployed branch).
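
Note that git reset --hard HEAD~1 only removes the offending commit when that commit is the tip of the branch, and it also discards uncommitted local changes. The sketch below shows one possible way to choose between a reset and a revert. The helper names are hypothetical, and falling back to git revert is a suggested alternative, not something proposed elsewhere in this thread.

```python
import subprocess


def rollback_command(head_sha, bad_sha):
    """Pick a git command that undoes bad_sha given the current HEAD.

    Both arguments are expected to be full 40-character SHAs.
    """
    if head_sha == bad_sha:
        # The offending commit is the tip, so dropping it matches the thread.
        return ["git", "reset", "--hard", "HEAD~1"]
    # Later commits exist: undo only the offending one and keep history.
    return ["git", "revert", "--no-edit", bad_sha]


def current_head(repo_dir="."):
    """Return the full SHA of HEAD in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

For example, rollback_command(current_head(), "334023fa55d83d558688816c82f2eaf4eb26382e") returns the reset command only if that commit is still HEAD. If the commit has already been pushed to a shared branch, a revert avoids rewriting published history.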

AbhigyaWangoo commented 3 hours ago

LGTM, running command and closing out