-
Hello,
I am trying to implement model parallelism using PyTorch on my HPC environment, which has 4 GPUs available. My goal is to split a neural network model across these GPUs to improve training eff…
-
# Description of problem
After setting `virtio_fs_cache_size = 2048`, creating a container fails with an error:
```
root@masyer:/home# ctr --debug run --runtime "io.containerd.kata.v2" --rm -t "docker.io/library/archlin…
```
-
When running `python stable_diffusion.py --optimize`, I get a "TypeError: z_(): incompatible function arguments" error for "Optimizing text_encoder". Note that "Optimizing vae_encoder", "Optimizing …
-
## Improved Proposal
**Proposed System Summary**
Imagine instructing a computer in plain English to solve a complex problem. Our system would not only understand your request but would also …
ghost updated
4 weeks ago
-
### Describe the bug
A control (a Button, for instance) with a KeyboardAccelerator attached fires multiple events for as long as the keys are held down.
### Steps to reproduce the bug
1) Create a simple WinUI …
-
### Cautions:
**Before starting the task, please refer to [Add data of ML-YouTube-Courses](https://github.com/orgs/ocademy-ai/projects/3/views/1?filterQuery=label%3Adata&pane=issue&itemId=36101499)…
-
Hi,
I have three questions:
1. What is the difference between dl, xdl, and wmma in /example/01_gemm?
![image](https://github.com/ROCmSoftwarePlatform/composable_kernel/assets/22981348/c805b7df-2c93-4d01-b2a…
-
### Bug description
When configuring a `DDPStrategy` with multiple devices that do not use the `torch.cuda` API, we trigger the following exception:
```python
File "/home/hpclee1/rds/hpc-work/.…
```
-
The spec describes [Devices](https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md#devices) that are container based, but there is another class of Devices, Network Devices, that …
aojea updated
2 weeks ago
-