-
Hey everyone, awesome project :-) I'm having fun playing around with it, but I think my GPU isn't being utilised. I can see my CPU maxing out, but I'm not seeing much of a change in my GPU usage, just wond…
-
There's a new cache technique mentioned in the paper https://arxiv.org/abs/2312.17238. (github: https://github.com/dvmazur/mixtral-offloading)
They introduced an LRU cache that caches experts based on patt…
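For intuition, a minimal sketch of the idea (not the paper's or the repo's actual implementation) could look like this in PyTorch, assuming CPU-resident expert modules keyed by id and a fixed number of GPU slots; `ExpertLRUCache` and its parameters are made-up names:

```python
from collections import OrderedDict
import torch

class ExpertLRUCache:
    """Keep at most `capacity` expert modules on the GPU;
    evict the least-recently-used expert back to CPU when full."""

    def __init__(self, capacity: int, device: str = "cuda"):
        self.capacity = capacity
        self.device = device
        self._resident = OrderedDict()  # expert_id -> module currently on GPU

    def get(self, expert_id, cpu_experts):
        # Cache hit: mark this expert as most recently used.
        if expert_id in self._resident:
            self._resident.move_to_end(expert_id)
            return self._resident[expert_id]

        # Cache miss: evict the least-recently-used expert back to CPU if full.
        if len(self._resident) >= self.capacity:
            old_id, old_expert = self._resident.popitem(last=False)
            cpu_experts[old_id] = old_expert.to("cpu")

        # Move the requested expert onto the GPU and register it as resident.
        expert = cpu_experts[expert_id].to(self.device)
        self._resident[expert_id] = expert
        return expert
```

On each MoE layer forward pass, the router's chosen expert would be fetched through `cache.get(expert_id, cpu_experts)`, so frequently reused experts stay on the GPU while rarely used ones live on the CPU.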
-
### Expected Behavior
Not 10 GB of VRAM eaten when using the LoRA.
### Actual Behavior
I have Flux fp8 schnell on a 3090. I load two rank-64 LoRAs onto the model, but it uses all the VRAM until it starts offload…
-
# Summary
We enabled node status offload and workflow archiving, and we have observed some performance and stability issues:
- there are many slow MySQL queries when running thousands of wor…
-
It would be useful if Flatpak had a way to uninstall specific applications, keeping only the .desktop file and icons so a placeholder launcher exists.
Clicking on the .desktop file would reinstall …
-
Once uploaded, files should be kept locally for a short time before being offloaded to `repo.spongepowered.org`.
This means you need:
- [ ] a small background task which waits for new files and periodi…
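A rough sketch of what that background task could look like (the directory, retention window, and `upload_to_repo` below are placeholders, not the actual repo tooling):

```python
import time
from pathlib import Path

STAGING_DIR = Path("/var/lib/artifacts/staging")  # hypothetical local staging directory
RETENTION_SECONDS = 15 * 60                        # keep files locally for ~15 minutes
SCAN_INTERVAL = 60                                 # how often the task wakes up

def upload_to_repo(path: Path) -> None:
    """Placeholder for the actual upload to repo.spongepowered.org."""
    raise NotImplementedError

def offload_loop() -> None:
    while True:
        now = time.time()
        for path in STAGING_DIR.iterdir():
            if not path.is_file():
                continue
            # Only offload files older than the retention window.
            if now - path.stat().st_mtime > RETENTION_SECONDS:
                upload_to_repo(path)
                path.unlink()  # drop the local copy once offloaded
        time.sleep(SCAN_INTERVAL)
```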
-
Hello,
I wanted to ask you, @lsalzman, whether ENet could benefit from GSO and sendmmsg (instead of plain 'sendmsg') in order to improve throughput?
( see: https://blog.cloudflare…
-
Although algorithm (static) class templates should not care about where computation is performed (CPU or GPU), I think there are a few design choices that motivate parameterizing the algorithm itself …
-
Hi,
Since it is common to use DeepSpeed ZeRO with offloading when training large LLMs, does TE currently support this mode?
Currently DeepSpeed support is just a unit test, as referred to by TE's r…
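For reference, a typical ZeRO-3 + CPU offloading configuration that would be passed to `deepspeed.initialize` looks roughly like this (the values are illustrative, not a recommendation for TE):

```python
import deepspeed

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        # Offload parameters and optimizer state to pinned CPU memory.
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "bf16": {"enabled": True},
}

# `model` would come from the user's training script, e.g. a TE-based transformer:
# model_engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config
# )
```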
-
### Describe the bug
I noticed that when I add ```align_device_hook``` to a module in the pipeline manually, the ```load_lora_weights``` function will enable sequential CPU offload, so I dug deeper a…
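For context, a minimal sketch of the kind of setup described above (the checkpoint and LoRA ids are placeholders; the hook comes from accelerate's `AlignDevicesHook` / `add_hook_to_module`):

```python
import torch
from diffusers import StableDiffusionPipeline
from accelerate.hooks import AlignDevicesHook, add_hook_to_module

# Placeholder checkpoint id, just for illustration.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Manually attach an AlignDevicesHook to one of the pipeline's modules.
hook = AlignDevicesHook(execution_device="cuda", offload=False)
add_hook_to_module(pipe.unet, hook)

# Loading LoRA weights afterwards is where sequential CPU offload
# unexpectedly gets enabled, per the report above.
pipe.load_lora_weights("some-user/some-lora")  # placeholder repo id
```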