OpenRouterTeam / openrouter-runner

Inference engine powering open source models on OpenRouter
https://openrouter.ai
MIT License

perf: bump vllm container cpu memory from 128M to 1024M #53

Closed: sambarnes closed this 9 months ago

sambarnes commented 9 months ago

Details

Previously, the FastAPI completion endpoint's memory was bumped to 1024 MiB. However, there is a container boundary that the web endpoint crosses when actually doing the generation, so this bumps the resources on the other side of that boundary (the vLLM container) to measure the impact.
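For context, here is a minimal sketch of what a bump like this looks like in a Modal container definition. This is not the actual runner code: the app name, image, GPU type, and class below are illustrative placeholders, and only the `memory` argument (specified in MiB) is the point of interest.

```python
# Illustrative sketch only; names and layout do not mirror openrouter-runner.
import modal

stub = modal.Stub("vllm-runner-example")  # hypothetical app name

# Hypothetical image carrying the inference dependencies.
vllm_image = modal.Image.debian_slim().pip_install("vllm")


@stub.cls(
    image=vllm_image,
    gpu="A100",    # generation runs on the GPU; the change here is host (CPU) memory
    memory=1024,   # CPU memory in MiB, raised from 128 to 1024 per this PR's title
)
class Engine:
    @modal.method()
    def generate(self, prompt: str) -> str:
        # vLLM generation would happen here; omitted in this sketch.
        return prompt
```

The web (FastAPI) endpoint and the engine run in separate containers, so raising memory on the endpoint alone does not affect the container that actually executes generation; this change targets that second container.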
