-
### 🐛 Describe the bug
Repro
~~~
import torch
mod = torch.nn.Linear(10, 10).cuda()
@torch.compile(backend="aot_eager")
def generate(x, c):
    return mod(x) + c
for _ in range(…
-
For all loaded tensors, we currently stream them in with an evict-first policy, as represented by the `.cs` value of the `.cop` field.
https://github.com/NVIDIA/Fuser/blob/7aaa6c3487e29c40ec91…
-
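Bug/issue report template; delete this if you are making a suggestion
## 1. About the issue you are submitting
Q: Have you searched the existing issues? (mark with "x")
* [] No similar issue found
## 2. Detailed description
### (1) The specific problem
A: When there are many active connections and clients, soft interrupts increase, CPU usage rises, and network speed drops.
There are currently about 5,000 connections and about 65 clients. Checking usage with the top command shows that ksof…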
-
Related to https://github.com/scikit-learn/scikit-learn/issues/27086 but specialized to the regression observed in `HistGradientBoostingClassifier` for better tracking.
With Cython 3, it looks like…
-
# How to Instrument a PhET Simulation for PhET-iO
## Before instrumenting
- [x] Create a "PhET-iO Instrumentation" issue in the simulation repository. Copy this checklist to the issue
descriptio…
-
**Why do you need support for this specific architecture?**
Those are modern ARM CPUs
**Which architecture model, family and further information? CPU or accelerator?**
Cortex-A72, Cortex-A78
*…
-
Wondering if anyone has done investigation into performance of these two hooks. I'd assume that this unlocked performance benefits in two areas: CPU/memory usage and network request waterfalls (due to…
-
Starting a thread for folks to propose alternative names. If you have an idea, please post it here in a single comment. 👍 the ideas you like. If your name is not descripti…
-
### 🐛 Describe the bug
```
2023-05-08T10:46:35.4955096Z cuda eval llama ERROR:common:backend='inductor' raised:
2023-05-08T10:46:35.4956007Z AssertionError: slice.Ten…
-
I think it could be fun/interesting to work on making optcarrot run a little faster, so I did some profiling of optcarrot with Kokubun's dynamic send patch.
Method used:
```
ruby -I harness-conti…