-
### ⚠️ Please check that this feature request hasn't been suggested before.
- [X] I searched previous [Ideas in Discussions](https://github.com/OpenAccess-AI-Collective/axolotl/discussions/categories…
-
Hi community and authors,
As the official document [1] states, SFs support E-Switch representation offload like existing PF and VF representors. However, a simple rte_flow rule cannot be install…
-
Upgraded from 22.03.0-rc1 to rc4 with a fresh config on my Archer C7v5 and I noticed my speedtest dropped from 500/100 to 240/100.
This also occurs on my Deco M4Rv2 with OpenWrt 22.03.0-rc4 firmware installe…
-
### System Info
During inference of larger models in VRAM-constrained environments, offloading unused model layers from VRAM to RAM is an easy way to reduce overall VRAM usage. Linear8bitLt …
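The excerpt cuts off at `Linear8bitLt`, but the general layer-offload pattern it describes can be sketched with plain PyTorch forward hooks. This is an illustrative sketch, not the bitsandbytes `Linear8bitLt` code path; the model and sizes are made up.

```python
# Minimal sketch of per-layer CPU<->GPU offloading during inference.
# Illustrative only: not the bitsandbytes Linear8bitLt implementation.
import torch
import torch.nn as nn

def attach_offload_hooks(layer: nn.Module, device: str = "cuda"):
    def pre_hook(module, args):
        module.to(device)   # bring this layer's weights into VRAM just in time
        # returning None leaves the inputs untouched

    def post_hook(module, args, output):
        module.to("cpu")    # evict the weights from VRAM once the layer is done
        # returning None leaves the output untouched

    layer.register_forward_pre_hook(pre_hook)
    layer.register_forward_hook(post_hook)

# Illustrative model: eight Linear layers kept in system RAM between uses.
model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(8)])
for layer in model:
    attach_offload_hooks(layer)

with torch.no_grad():
    x = torch.randn(1, 4096, device="cuda")
    y = model(x)  # each layer occupies VRAM only while it computes
```

The trade-off is a PCIe transfer on every forward pass, which is why more sophisticated implementations overlap copying the next layer with the current layer's compute.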
-
### What is the issue?
Not only does it not fit in 96 GB (offloading only 10 layers out of 81), but processing an actual ~128k request crashes with `CUDA error: out of memory` on 160 GB (with all layers off…
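The report doesn't include the model's attention dimensions, but a back-of-envelope KV-cache estimate shows why a ~128k-token request is so memory-hungry. The head count and head size below are assumptions for a large 81-layer model, not values from the report:

```python
# Rough KV-cache size: 2 (K and V) * layers * tokens * kv_heads * head_dim * bytes.
# Only n_layers comes from the report; the rest are assumed.
n_layers   = 81        # from the report
ctx_len    = 128_000   # ~128k-token request
n_kv_heads = 8         # assumed (grouped-query attention)
head_dim   = 128       # assumed
bytes_elem = 2         # fp16 cache

kv_bytes = 2 * n_layers * ctx_len * n_kv_heads * head_dim * bytes_elem
print(f"{kv_bytes / 2**30:.1f} GiB")  # ~39.6 GiB on top of the weights
```

With full multi-head attention (say 64 KV heads) instead of GQA, the same cache would be eight times larger, which by itself would exhaust 160 GB.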
-
Is anyone else having issues using articulated tractors to unload combines? They seem to get to the combine fine, but when they get under the auger, they start turning side to side faster and faster …
-
Is it currently possible to offload Wallpaper Engine to the dedicated (NVIDIA) GPU instead of defaulting to the integrated GPU when rendering scene wallpapers?
I've looked through older issues, w…
-
### Describe the bug
I noticed that when I manually add `align_device_hook` to a module in the pipeline, the `load_lora_weights` function enables sequential CPU offload. So I dug deeper a…
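For context, here is a minimal sketch of the reported setup, assuming accelerate's `attach_align_device_hook` utility is what "align_device_hook" refers to; the checkpoint id and LoRA path are placeholders, not taken from the report.

```python
# Sketch of the reported reproduction: manually attach an align-device hook,
# then load LoRA weights. Checkpoint id and LoRA path are placeholders.
import torch
from accelerate.hooks import attach_align_device_hook
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Manually attach an align-device hook to one module of the pipeline.
attach_align_device_hook(pipe.unet, execution_device=torch.device("cuda"))

# Loading LoRA weights afterwards is what reportedly flips the pipeline
# into sequential CPU offload.
pipe.load_lora_weights("path/to/lora_weights.safetensors")
```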
-
The modem driver doesn't support offloading TLS sockets or certificate management, although the modem itself has a cipher suite and a certificate manager controlled by AT commands.
I see that other mod…
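As an illustration of "controlled by AT commands", here is a host-side sketch using pyserial. The certificate-manager command name is hypothetical, since every modem vendor defines its own AT command set; the serial port is a placeholder.

```python
# Sketch: driving a modem over AT commands with pyserial.
# The AT+CERTSTORE command below is hypothetical; consult the modem's
# AT reference for the real certificate-manager commands.
import serial

ser = serial.Serial("/dev/ttyUSB2", 115200, timeout=2)  # placeholder port

def at(cmd: str) -> str:
    ser.write((cmd + "\r\n").encode())
    return ser.read(256).decode(errors="replace")

print(at("AT"))                    # basic liveness check, expect "OK"
print(at('AT+CERTSTORE="ca",0'))   # hypothetical: select a CA cert slot
```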
-
I want to use DeepSpeed for inference, but I am not able to load the model correctly with DeepSpeed. As I understand it, DeepSpeed should load all the model weights on CPU or NVMe. But …
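For reference, a minimal sketch of what CPU/NVMe weight offload for inference looks like with a ZeRO-3 config, assuming the ZeRO-Inference path through `deepspeed.initialize`; the model name and config values are illustrative, not from the report.

```python
# ZeRO-3 parameter-offload sketch for inference (ZeRO-Inference style).
# Model name and config values are illustrative.
# Typically launched with: deepspeed this_script.py
import deepspeed
import torch
from transformers import AutoModelForCausalLM

ds_config = {
    "train_micro_batch_size_per_gpu": 1,  # required key even for inference
    "zero_optimization": {
        "stage": 3,                       # partition parameters across workers
        "offload_param": {
            "device": "cpu",              # or "nvme" plus an "nvme_path"
            "pin_memory": True,
        },
    },
    "fp16": {"enabled": True},
}

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
engine, *_ = deepspeed.initialize(model=model, config=ds_config)
engine.module.eval()

with torch.no_grad():
    ids = torch.tensor([[0, 1, 2]]).to(engine.device)
    out = engine(input_ids=ids)  # weights stream in from CPU as needed
```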