mit-submit / A2rchi

An AI Augmented Research Chat Intelligence for MIT's subMIT project in the physics department
MIT License

Post-Mortem: Changes to Prepare for Next Week #126

Open mdr223 opened 1 year ago

mdr223 commented 1 year ago

Summarizing the steps to begin hardening our system for next week. Please add anything I may have missed from our discussion.

  1. Merge PR #119 to move Data Manager into chat & cleo services
  2. Merge PR #109 to finish updates to A2rchi Meta
  3. Create PR to move the lock so that it no longer guards OpenAI calls and only guards update(s) to the vector store
  4. Merge this^ PR to main
  5. Write simple script w/for loop to submit N API calls to our Flask app, measure and plot histogram of latencies for N = [10, 100, 1000, 10000]
    • Call A2rchi at t3desk19.mit.edu:7683 ~5 times using the PSET 7 question to get some sense of the distribution for the latency
    • Simulate N calls to Flask app by deploying DumbLLM in dev but on t3desk19 with time.sleep(np.random.normal(mean, std)) where mean and std are guesstimates of latency parameters based on calls to t3desk19.mit.edu
    • Running experiment on t3desk is important b/c parallelism will be different compared with submit06
    • Record and store results
  6. Depending on performance results from 5., we may have to contemplate load-balancing w/multiple containers
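
For step 3, a rough sketch of the idea, not our actual code: the class name, the `llm_call` callable, and the dict used as a vector store are all hypothetical stand-ins. The point is only that the lock is held while shared state is touched and is released before the slow LLM call, so concurrent requests no longer queue behind each other. Taking a short snapshot under the lock on the read side is an assumption on my part; the real store may not need it.

```python
import threading


class ChainSketch:
    """Hypothetical stand-in for whatever owns the vector store and calls the LLM."""

    def __init__(self, llm_call):
        self._llm_call = llm_call            # hypothetical callable wrapping the OpenAI request
        self._store_lock = threading.Lock()  # guards the vector store only
        self._vectorstore = {}               # toy stand-in for the real vector store

    def update_vectorstore(self, docs):
        # Updates to shared state are serialized by the lock.
        with self._store_lock:
            self._vectorstore.update(docs)

    def answer(self, question):
        # Hold the lock only long enough to copy what we need ...
        with self._store_lock:
            context = dict(self._vectorstore)
        # ... and make the slow call with the lock already released.
        return self._llm_call(question, context)
```

With the lock scoped like this, several `answer` calls can be in flight at once while `update_vectorstore` still runs one at a time.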

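For step 5, a minimal sketch of the measurement loop using only the standard library. Assumptions: `simulated_call` stands in for one request to the app (the real endpoint path is not written down here, so it is left out), `random.gauss` is used in place of `np.random.normal`, and the default `mean` and `std` are placeholders rather than measured values. `bin_counts` only produces the numbers behind a histogram; plotting is left to whatever tool we prefer.

```python
import random
import statistics
import time


def simulated_call(mean=0.002, std=0.001):
    """Stand-in for one request: sleep for a normally distributed latency (seconds)."""
    # A normal sample can be negative, and time.sleep raises ValueError
    # on negative input, so clamp at zero.
    time.sleep(max(0.0, random.gauss(mean, std)))


def measure_latencies(call, n):
    """Invoke call() n times in a plain for loop; return per-call latencies in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)
    return latencies


def bin_counts(latencies, n_bins=10):
    """Count latencies per equal-width bin (the data behind a histogram)."""
    lo, hi = min(latencies), max(latencies)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * n_bins
    for x in latencies:
        # The maximum value would land one past the last bin, so clamp it.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    for n in (10, 100):
        lat = measure_latencies(simulated_call, n)
        print(n, round(statistics.mean(lat), 4), bin_counts(lat))
```

Note that this loop issues calls one after another, so it measures per-call latency rather than behavior under concurrent load; swapping `simulated_call` for a function that performs the real request keeps the rest unchanged.
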
And separately but also importantly:

  1. Remove raise e from inside our main try-except block in ChatWrapper
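
A sketch of the pattern this implies, with the caveat that `safe_call` and the shape of the error reply are hypothetical and not what ChatWrapper currently does: the exception is logged with its traceback and converted into a reply, instead of being re-raised out of the handler.

```python
import logging

logger = logging.getLogger(__name__)


def safe_call(handler, *args, **kwargs):
    """Run handler; on any exception, log it and return an error reply instead of re-raising."""
    try:
        return handler(*args, **kwargs)
    except Exception:
        # No `raise e` here: the traceback goes to the log and the caller
        # still gets a well-formed (hypothetical) error payload.
        logger.exception("request failed")
        return {"error": "Something went wrong, please try again."}
```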

julius-heitkoetter commented 1 year ago

One edit here: Ludo raised an issue (#115) that the DumbLLM is no longer compatible. We need to resolve that issue with a PR before we get to step 5.

mdr223 commented 1 year ago

One final addition we will likely want to make is merging in my work on the db backend once it's ready. Without that, we will also need to lock the conversations.json file, which is plausible but could harm performance.
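
If we do end up locking the file, it could look roughly like the sketch below. Assumptions: the JSON layout (a dict mapping a conversation id to a list of messages) and the `append_message` helper are made up for illustration, and a `threading.Lock` only protects threads inside one process; multiple containers or worker processes would need a file-level lock instead. Every write holds the lock for a full read-modify-write of the file, which is where the performance cost comes from.

```python
import json
import os
import tempfile
import threading

_conversations_lock = threading.Lock()  # hypothetical module-level lock


def append_message(path, conversation_id, message):
    """Append one message to a conversation stored in a JSON file (assumed layout)."""
    with _conversations_lock:
        try:
            with open(path, "r", encoding="utf-8") as f:
                data = json.load(f)
        except FileNotFoundError:
            data = {}
        data.setdefault(conversation_id, []).append(message)
        # Write to a temp file in the same directory, then rename it into place,
        # so a crash mid-write does not leave a truncated conversations file.
        dir_name = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp_path, path)
```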

ludomori99 commented 1 year ago

I believe we also discussed introducing timeouts in the requests in the app. Not sure if we want to open an issue about that.
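
One possible shape for this, as a sketch only: `call_with_timeout`, the `TIMED_OUT` sentinel, and the pool size are hypothetical, and this is not necessarily how we would wire it into the app. The work runs in a thread pool and the caller stops waiting after `timeout_s` seconds. A caveat worth knowing: the worker thread is not killed on timeout, it just keeps running in the background until it finishes.

```python
import concurrent.futures

_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)  # hypothetical pool size
TIMED_OUT = object()  # sentinel returned when the call did not finish in time


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in the pool and wait at most timeout_s seconds for its result."""
    future = _executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # cancel() only has an effect if the task has not started yet;
        # a task that is already running continues in the background.
        future.cancel()
        return TIMED_OUT
```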

mdr223 commented 1 year ago

With the new PR for request timeouts (#141), I believe we can close this issue once the PR is merged.