-
# Description of problem
I created a Kata container using Docker's runtime option.
`sudo docker run -it --runtime io.containerd.kata.v2 ubuntu:22.04 /bin/bash`
Inside the container, network is work…
-
OSv v0.09
eth0: 192.168.122.15
page fault outside application, addr: 0x0000000000000000
[registers]
RIP: 0x000000000041cf65
RFL: 0x0000000000010246 CS: 0x0000000000000008 SS: 0x0000000000000010
R…
-
### 🐛 Describe the bug
We found that the FLOPs counter reports an incorrect FLOP count for SDPA operations.
This issue is not present in the torch 2.4+cu121 release.
Repro code:
```
from torch.ut…
-
systemctl status mongod.service
× mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; enabled; vendor preset: enabled)
Active: failed (Result: co…
-
Following up on Brendan Gregg's excellent post at http://www.brendangregg.com/blog/2020-11-04/bpf-co-re-btf-libbpf.html, I looked into packaging the `libbpf-tools` binaries (`biolatency`, `biopattern`, `biosno…
-
### 🐛 Describe the bug
When the model is compiled using torch.compile, the backward graph has dynamically shaped tensors, but when the same model is wrapped with compiled_autograd.enable(), the backward grap…
-
A NetBSD tutorial is missing.
I posted README notes to a NetBSD mailing list.
http://mail-index.netbsd.org/netbsd-users/2019/02/13/msg022207.html HAXM in pkgsrc
```
HAXM has been import…
-
### 🐛 Describe the bug
I had an issue in one of the services I work on, where it would use more and more memory until crashing. After some digging around I was able to reduce it to the following scri…
-
Thanks for your error report and we appreciate it a lot.
**Checklist**
* I have searched the tutorial on modelscope [doc-site](https://modelscope.cn/docs)
* I have searched related issues but …
-
### 🐛 Describe the bug
I am running multi-node training of T5-11B using FSDP. Running this on 5 nodes, each with 8 A100 40 GB GPUs, works fine with PT 1.13.1 and PT 2.0; however, this runs into OOM with P…