RR4787 closed 1 year ago
Could you please include some evidence in the PR description that these work? Ideally we would have tests that at least try to start the server in CI, but I understand if that is not a top priority right now. At minimum, I would like to see some manual tests in the PR description.
Updates the YAMLs and model handlers to support the Go server and batching. Diffusion is left out since it's broken and needs to be fixed. Changes attn_cnfg['attn_impl'] in the mpt7b handler from 'triton' to 'torch' for the time being, while dependency issues in the Go server are sorted out.
Manually tested by deploying the models with the Go server and running a script that sends batched requests.
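For reference, a manual batching check of this kind could look roughly like the sketch below. This is not the script used for this PR. The endpoint URL, the payload shape, and the helper names `post_json` and `send_concurrently` are all assumptions made for illustration. The idea is to fire several requests at the same time so the server has a chance to group them into one batch.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_json(url, payload, timeout=30.0):
    """POST one JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def send_concurrently(url, payloads, sender=post_json, max_workers=8):
    """Send all payloads in parallel threads.

    Results come back in the same order as `payloads`, which makes it
    easy to check that each request received its own response.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: sender(url, p), payloads))


# Example usage (hypothetical endpoint and payload shape):
# url = "http://localhost:8080/predict"
# prompts = [{"input": f"prompt {i}"} for i in range(8)]
# for response in send_concurrently(url, prompts):
#     print(response)
```

The `sender` argument is only there so the concurrency logic can be exercised without a running server. Pasting the output of a run like this into the PR description would cover the manual evidence requested above.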