-
Incredible project! I managed to run the model at good speed on my hardware (AMD), thanks.
I have a question: do you have any plans to offload the weights to be able to run bigger models like 13B o…
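For intuition on what such weight offloading involves, here is a rough, framework-free sketch of deciding how many layers fit in VRAM (the function name, layer sizes, and memory figures are illustrative assumptions, not part of the project):

```python
# Illustrative sketch only: estimate how many equally sized transformer
# layers fit in a VRAM budget, offloading the rest to system RAM.

def layers_that_fit(vram_bytes: int, layer_bytes: int, reserve_bytes: int = 0) -> int:
    """Return how many equally sized layers fit in the usable VRAM budget."""
    usable = max(vram_bytes - reserve_bytes, 0)
    return usable // layer_bytes

# Hypothetical example: a 13B model with 40 layers of ~320 MiB each (fp16),
# on an 8 GiB card, reserving 1 GiB for activations and the KV cache.
GIB = 1024 ** 3
MIB = 1024 ** 2
n_gpu = layers_that_fit(8 * GIB, 320 * MIB, reserve_bytes=1 * GIB)
print(n_gpu)        # layers kept on the GPU
print(40 - n_gpu)   # layers offloaded to host memory
```

The remaining layers would be kept in host RAM and streamed in per forward pass, trading throughput for capacity.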
-
I was wondering whether the Intel SHMEM library also works with OpenMP offloading, i.e., can Intel SHMEM routines be called from within `#pragma omp target` regions?
Edit: So I'm interested in this hybrid…
-
### What is the issue?
I've noticed there are other recent issues on offloading; however, since I'm using a different setup, I thought opening a separate issue would make sense. I neither use docker…
oleid updated 2 months ago
-
| Metadata | |
| -------- | --- |
| Owner(s) | @ZuseZ4 |
| Team(s) | [compiler](http://github.com/rust-lang/compiler-team), [lang](http://github.com/rust-lang/lang-team) |
| Goal…
-
When run with `pipe.enable_model_cpu_offload()`, the pipeline raises an error:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking ar…
```
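This error usually means one of the inputs (a latents tensor, prompt embedding, etc.) was created on a different device than the one the offloaded module currently runs on. A framework-free sketch of the usual "move every input to one device first" pattern (`FakeTensor` and `to_device` are illustrative stand-ins, not diffusers API):

```python
from dataclasses import dataclass

@dataclass
class FakeTensor:
    """Stand-in for a framework tensor that remembers which device it lives on."""
    device: str

    def to(self, device: str) -> "FakeTensor":
        return FakeTensor(device=device)

def to_device(inputs: dict, device: str) -> dict:
    """Move every tensor-valued input onto one device before the forward pass."""
    return {k: v.to(device) if isinstance(v, FakeTensor) else v
            for k, v in inputs.items()}

# One input lives on the CPU, another on the GPU: the exact situation the
# RuntimeError above complains about. Unify them before calling the model.
inputs = {"prompt_embeds": FakeTensor("cpu"),
          "latents": FakeTensor("cuda:0"),
          "num_steps": 30}
moved = to_device(inputs, "cuda:0")
assert all(v.device == "cuda:0" for v in moved.values()
           if isinstance(v, FakeTensor))
```

With CPU offload enabled, the same check applies in reverse: inputs must follow whichever device the hooks move the module to.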
-
The Adam optimizer can consume a large amount of GPU memory, potentially causing OOM (out-of-memory) errors during training. To free up memory during the forward/backward passes, there is a need for a fea…
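Conceptually, such a feature keeps Adam's moment buffers in host memory and brings them to the GPU only for the parameter update, so they occupy no GPU memory during the forward/backward passes. A framework-free sketch of that pattern (class and method names are illustrative; devices are plain strings here, where a real framework would perform actual tensor moves):

```python
# Illustrative sketch of optimizer-state offloading: Adam's moment buffers
# ("m", "v") rest in host memory between steps and visit the GPU only while
# their parameter is being updated.

class OffloadedAdamState:
    def __init__(self, param_names):
        # Moment buffers start (and rest) on the host.
        self.state = {name: {"m": 0.0, "v": 0.0, "device": "cpu"}
                      for name in param_names}

    def fetch(self, name):
        """Bring one parameter's state to the GPU just for its update."""
        s = self.state[name]
        s["device"] = "cuda:0"
        return s

    def evict(self, name):
        """Send the state back to host memory, freeing GPU memory."""
        self.state[name]["device"] = "cpu"

    def step(self, grads, beta1=0.9, beta2=0.999):
        for name, g in grads.items():
            s = self.fetch(name)   # on the GPU only during this update
            s["m"] = beta1 * s["m"] + (1 - beta1) * g
            s["v"] = beta2 * s["v"] + (1 - beta2) * g * g
            self.evict(name)       # back to the CPU before the next parameter

opt = OffloadedAdamState(["w1", "w2"])
opt.step({"w1": 0.1, "w2": -0.2})
# After the step, no optimizer state remains on the GPU.
assert all(s["device"] == "cpu" for s in opt.state.values())
```

The cost is extra host-device traffic per step; the benefit is that peak GPU usage during forward/backward no longer includes the optimizer state.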
-
## Precommit CI Run information
Logs can be found in the associated GitHub Actions run: https://github.com/ewlu/gcc-precommit-ci/actions/runs/10956983785
## Patch information
Applied patches: 1 -> 1
A…
-
I'm testing the [kv reuse feature](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/kv_cache_reuse.md).
Everything works fine until I try to use [offloading to host mem](https://github.com/N…
-
# Summary
Support the offloading of the following IPv6 networking functions, which already work in the non-offloading path:
- VM to VM communication on the same hypervisor.
- VM to VM communication di…
-
Hi,
Since it is common to use DeepSpeed ZeRO with offloading when training large LLMs, does TE currently support this mode?
Currently, DeepSpeed support is just a unit test, as referred to by TE's r…
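For context, the offloading mode in question is typically enabled through the `zero_optimization` section of the DeepSpeed config; a minimal sketch looks like the fragment below (check the DeepSpeed documentation for the exact fields supported by your version):

```json
{
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    }
  }
}
```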