-
Repost from the [PyTorch forum](https://discuss.pytorch.org/t/flex-attention-gaps-in-profiler/211917/1)
I have recently been playing with FlexAttention, trying to replace some of my custom Triton …
-
### Your current environment
Docker with vllm/vllm-openai:v0.4.3 (latest)
### 🐛 Describe the bug
python3 -m vllm.entrypoints.openai.api_server --model ./Qwen1.5-72B-Chat/ --max-model-len 2400…
-
```go
// New wraps a handler and aborts the process of the handler if the timeout is reached
func New(opts ...Option) gin.HandlerFunc {
t := &Timeout{
timeout: defaultTimeout,
handler: nil…
-
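### Confirmation
- [X] I am running the latest version; my version number matches [version](https://github.com/hsuyelin/nas-tools/releases/latest).
- [X] I have searched the [issues](https://github.com/hsuyelin/nas-tools/issues) and confirmed that my problem has not been reported before.
- [X] I have edited the title, and the title…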
-
## Expected Behavior
I expect `AssertNumberOfCalls` to succeed only with one given “expectedCalls” value.
In the “Steps to reproduce” section below, there are two assertions to assert the number…
-
First of all, thanks for creating this project. It is awesome! I forked it to add a command line version of the app. I'm using it to minify code for the game Space Engineers which uses C# for in ga…
-
**Github username:** @maikelordaz
**Twitter username:** maikelordaz
**Submission hash (on-chain):** 0x4d0c9c61e1043cc911bb9f4320d7688977aa3468930532095e1d1b136e6e2c2f
**Severity:** medium
**Descript…
-
## Fix the model test for `llama_v2_7b_16h.py`
1. Set up the environment according to [Run a model under torch_xla2](https://github.com/pytorch/xla/blob/master/experimental/torch_xla2/docs/support_a_new_model.m…
-
This proposal is about providing a way to pass a `template.FuncMap` at template execution time to override specific functions within the template.
When executing a template using one of the new methods, then…
-
It would be *amazing* if we could call asynchronous Swift functions from Rust.
The only problem is: `@_cdecl` doesn't allow us to mark our functions as asynchronous:
`src-swift/lib.swift`
```sw…