-
Hi,
I've been running with an older image for quite a while successfully (undionly.kpxe from around 2020 or 2021) and this only started showing up with newer machines. I've updated to a current ver…
-
NCCL timed out while using ZeRO-3. How can I solve this problem?
I built on the large model Mixtral 8x7B and used the Llama architecture, augmenting it with multi-modal capa…
-
When using the HTML files provided by Mokuro (not to be confused with the separate Mokuro *Reader* app) and enabling Anki integration in the advanced settings, I can successfully create new cards by p…
-
The option to change screenshot format appears for me when saving images but disappears when copying to clipboard.
-
This program demonstrates a possible bug in the tester library with
array lists constructed using constructor blocks. The first test
does not use constructor blocks and works as expected. The second…
-
### 🐛 Describe the bug
When the flag `coordinate_descent_tuning` is disabled, GPT-FAST-MOE encounters a large perf regression: 52s -> 1049s. The impact of disabling it on CPU is to fall back bmm and mm…
-
We know that `Transformer_Engine` provides a context manager named [fp8_autocast](https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#FP8-autocasting) to handle a…
-
Please ensure you have given all the following requested information in your report.
#### Issue details
I am trying to use code from Bullet Demo to detect collision using btPairCachingGhostObject
…
-
## Linux distribution and version
GitLab Runner running the ubuntu:18.04 Docker image
## Flatpak version
1.4.2-flatpak1~bionic
## Description of the problem
This is essentially issue #1326, how…
-
# Per-Parameter-Sharding FSDP
## Motivation
As we looked toward next-generation training, we found limitations in our existing FSDP, mainly from the _flat parameter_ construct. To address these, w…