-
Fresh Ubuntu 14.04 installation on a Vagrant machine.
After installation, when I run `triton profile create`, I get:
```
assert.js:81
throw new assert.AssertionError({
^
AssertionError: 'fai…
```
-
The backend Qwen model does not enable decoupled mode (streaming); however, I found that the OpenAI API response did not show token usage. Below is an example:
ChatCompletion(id='cmpl-45f33530-2dcc-4352-8…
-
Here's the overall architecture of Triton:
![image](https://user-images.githubusercontent.com/166481/82379259-74854500-99db-11ea-9928-99370fb74d34.png)
In scope:
- Triton server
- Client SDKs …
yuryu updated
4 years ago
-
Platforms: inductor
This test was disabled because it is failing in CI. See [recent examples](https://hud.pytorch.org/flakytest?name=test_comprehensive_nn_functional_batch_norm_cuda_float64&suite=Tes…
-
### Describe the bug
In the current version of `openai-triton`, `v2.1.0`, which is used to build pytorch, there's a [function that calls `ldconfig -p`](https://github.com/openai/triton/blob/da40a1e98…
-
vocab_size not found in data/openwebtext/meta.pkl, using GPT-2 default of 50257
Initializing a new model from scratch
number of parameters: 124.34M
compiling the model... (takes a ~minute)
To use …
-
Deploying a pair of managers in an on-premises Triton installation works like a charm. triton-kubernetes talks to the API to discover images, packages, and networks.
Deploying a cluster (and nodes for it…
-
### Required prerequisites
- [X] Make sure you've read the [documentation](https://pybind11.readthedocs.io). Your issue may be addressed there.
- [X] Search the [issue tracker](https://github.com/pyb…
-
## 🚀 Motivation
Since 2.4 was released ~3 months ago, PyTorch has officially supported Python 3.12 (`torch.compile` lagged a bit behind due to changes in the CPython interface). It would be gre…
-
What deployment options are there for bge m3? Can it be deployed with Triton? If so, how?