-
### 🚀 The feature, motivation and pitch
PyTorch does not seem to provide wrappers for the mixed-precision algorithms in, e.g., MAGMA (dshpov, shportf) and cuSolver (https://docs.nvidia.com/cuda/cusolver/inde…
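
For readers unfamiliar with the technique being requested, here is a minimal NumPy sketch of mixed-precision iterative refinement: the linear solves run in float32 and the residual is corrected in float64. It is not the MAGMA or cuSolver implementation, and the function name `mixed_precision_solve` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def mixed_precision_solve(A, b, max_iters=10, tol=1e-12):
    """Solve A x = b with float32 solves and float64 residual correction.

    Hypothetical helper for illustration; a production routine would
    factorize A once in low precision and reuse the factors.
    """
    A64 = np.asarray(A, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    A32 = A64.astype(np.float32)

    # Initial solve entirely in float32, then promote the result.
    x = np.linalg.solve(A32, b64.astype(np.float32)).astype(np.float64)

    for _ in range(max_iters):
        # Residual is computed in float64 to recover the lost accuracy.
        r = b64 - A64 @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b64):
            break
        # Correction step: again a float32 solve (re-solved here for brevity).
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d
    return x
```

On a well-conditioned system, a few refinement steps are usually enough to bring the float32 solution close to float64 accuracy; on ill-conditioned systems the refinement may not converge, which is why library routines typically fall back to a full-precision solve.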
-
We should find a scheduling algorithm to reduce GPU idle time.
-
1. We observe a sudden, abnormal increase (2-3x) in collective communication time with all-reduce for messages of 1MB and beyond on GPUs. You can reproduce this issue by measuring the time taken to complete 10…
-
Is there any known solver instability when solving on the GPU? For example, I get the following behavior:
```
----------------------------------------------------------------------------
SCS v1.2.6 - Split…
```
-
As I understand it, through a tag-based system, the plugin can assign different compute targets for each node as follows:
```yaml
# excerpt from vertexai.yml
# see https://kedro-vertexai.readthedo…
```
-
Making FSDP auto-tune. There are many knobs that users can tune today with FSDP for both scaling and performance.
-
Sorry, I have some questions to ask:
1. If I set num_local_experts = 2, does that mean every GPU has two experts, and both experts' parameters reside on that one GPU?
2. If I set num_local_experts = -2, …
-
# Bucket Hashing: Layered Permutation and Mixing Hash (LPMH)
## Overview
The Layered Permutation and Mixing Hash (LPMH) algorithm is designed to offer enhanc…
-
Hi author,
Thank you for the great work. The algorithm runs very fast!
However, I think the current algorithm does not consider the corner case with just a single GPU (n=1), and in this case, the a…