-
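Inference submodules
- [ ] IR
- [ ] Manual graph construction
- [ ] Automatic graph construction
- [ ] Model conversion
- [ ] Model interpretation
- [ ] Computation graph construction
- [ ] Graph optimization
- [ ] Memory optimization
- [ ] High-performance operators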
-
Just some noncommittal ideas for features, protocols, and algorithms to add (or delete) before the next release.
Feel free to add anything in the comments.
-
### Description
Around the flashing step we should have an option to customize what is on the device.
Whether that is part of the built source code (e.g. through environment variables), linked in…
-
Here is the development roadmap. Contributions and feedback are welcome.
# Project Roadmap
## 1. Dynamic Vector Length Support
- [ ] Implement variable vector length (not fixed at 128).
## 2…
-
KubePlus's multi-instance multi-tenancy architecture has the following use cases:
- application hosting
- platform engineering
- managed service delivery
The following features will enhance Kube…
-
Naive layperson food-for-thought question after skimming through some of the docs and codebase:
Could we make use of S3 (or [S7](https://rconsortium.github.io/S7/) if we want to be forward-looking)…
-
# TensorRT Model Optimizer - Product Roadmap
[TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) (ModelOpt)’s north star is to be the best-in-class model optimization toolki…
-
### Initiative
npm 11
```[tasklist]
### Before release
- [ ] https://github.com/npm/cli/issues/7754
- [ ] https://github.com/npm/statusboard/issues/861
```
```[tasklist]
### For Release (breaking c…
-
### Description
We had the idea to create a roadmap in a Shiny app at some point. I wonder if that makes sense, because it depends on how we want to save the files along the workflow. I put the idea here, …
-
1. Port vllm/main features to ROCm
- [x] Support Llama/Llama-2 models for v0.2.x
- [x] Support SqueezeLLM
- [x] Support YARN
- [x] Merge into upstream vllm (https://github.com/vllm-project/vllm/p…