Joe/lang 214 add mock cog triton concurrency test to cog triton directory

I recently hit a regression in which increasing the concurrency of requests to a local cog-triton substantially degraded performance. We had observed and resolved this behavior before, so the regression was unexpected, inexplicable, and deeply frustrating.

In general, we do not have sufficient visibility into the isolated performance of the various components in our system. Is performance degraded because of Triton, TRT-LLM, cog, predict.py, or some interaction between two or more of these components? Who knows!

This PR targets this problem by adding /mock-cog-triton/ to the cog-triton repo. /mock-cog-triton/ includes the mocked predict.py that we've used to test cog performance.

Including it in cog-triton will make it easier for us to continuously validate cog performance and, more generally, to isolate and observe the performance of the cog portion of our system.

This PR adds mock_cog_triton, which includes:
- a symlink to ./cog.yaml
- a predict.py that emits tokens at specified rates
- a test_perf.py script that makes requests against the mock_cog_triton cog server and reports performance metrics
- new run instructions in ./README.md
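
As a rough illustration of the idea (not the code in this PR), the sketch below emits placeholder tokens at a specified rate and measures aggregate throughput at a given concurrency. It runs entirely in-process with asyncio rather than against a cog server, and every name in it (`emit_tokens`, `consume`, `run_concurrent`, `measure`) is hypothetical.

```python
import asyncio
import time
from typing import AsyncIterator


async def emit_tokens(n_tokens: int, tokens_per_second: float) -> AsyncIterator[str]:
    """Yield n_tokens placeholder tokens at a fixed rate (stand-in for a mocked predict)."""
    interval = 1.0 / tokens_per_second
    for i in range(n_tokens):
        await asyncio.sleep(interval)
        yield f"tok{i} "


async def consume(n_tokens: int, tokens_per_second: float) -> dict:
    """Drain one token stream, recording time to first token and total latency."""
    start = time.perf_counter()
    time_to_first = None
    count = 0
    async for _ in emit_tokens(n_tokens, tokens_per_second):
        if time_to_first is None:
            time_to_first = time.perf_counter() - start
        count += 1
    return {
        "tokens": count,
        "time_to_first_token_s": time_to_first,
        "latency_s": time.perf_counter() - start,
    }


async def run_concurrent(concurrency: int, n_tokens: int, tokens_per_second: float) -> dict:
    """Run `concurrency` streams at once and summarize aggregate throughput."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(consume(n_tokens, tokens_per_second) for _ in range(concurrency))
    )
    elapsed = time.perf_counter() - start
    total_tokens = sum(r["tokens"] for r in results)
    return {
        "concurrency": concurrency,
        "total_tokens": total_tokens,
        "elapsed_s": elapsed,
        "tokens_per_s": total_tokens / elapsed,
        "mean_latency_s": sum(r["latency_s"] for r in results) / len(results),
    }


def measure(concurrency: int, n_tokens: int = 20, tokens_per_second: float = 200.0) -> dict:
    """Synchronous entry point around run_concurrent."""
    return asyncio.run(run_concurrent(concurrency, n_tokens, tokens_per_second))


if __name__ == "__main__":
    # Sweep a few concurrency levels and print the summary for each.
    for c in (1, 4, 16):
        print(measure(c))
```

If the serving layer adds no overhead, aggregate tokens per second should grow roughly linearly with concurrency while per-request latency stays flat. A sweep where throughput plateaus or latency climbs is the kind of signal the mock is meant to surface.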

replicate/cog-triton #24