-
Self-explanatory.
- Implement a dynamic widget preview
- Need to make sure the ffmpeg extension is rebuilt and works with 16 KB page sizes (this could possibly help speed up decoding)
- Investigate offlo…
-
There's a new cache technique mentioned in the paper https://arxiv.org/abs/2312.17238. (github: https://github.com/dvmazur/mixtral-offloading)
They introduced an LRU cache to cache experts based on patt…
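
For illustration, the general idea of an LRU cache for experts can be sketched as below. This is a minimal sketch, not the code from the paper or the linked repository: `ExpertLRUCache`, `load_fn`, and the hit/miss counters are hypothetical names, and the loader stands in for whatever copies an expert's weights from host memory to the GPU.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class ExpertLRUCache:
    """Keeps at most `capacity` experts resident; evicts the least recently used."""

    def __init__(self, capacity: int, load_fn: Callable[[Hashable], Any]):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Hypothetical loader: in a real system this would move the expert's
        # weights from CPU memory to the accelerator.
        self.load_fn = load_fn
        self._cache: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, expert_id: Hashable) -> Any:
        if expert_id in self._cache:
            # Cache hit: mark this expert as most recently used.
            self._cache.move_to_end(expert_id)
            self.hits += 1
            return self._cache[expert_id]
        # Cache miss: load the expert, then evict the oldest entry if over capacity.
        self.misses += 1
        expert = self.load_fn(expert_id)
        self._cache[expert_id] = expert
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)
        return expert

    def resident(self) -> list:
        """Expert ids currently cached, least recently used first."""
        return list(self._cache.keys())
```

With a capacity of 2, requesting experts 0, 1, 0, 2 in that order evicts expert 1, because expert 0 was touched more recently.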
-
**Describe the bug**
A whole variety of periodic Gaussian tests are failing with LLVM offload. The restart tests are also failing.
These are in the nightlies and offloading to V100.
See: htt…
-
### What is the issue?
Not only does it not fit in 96 GB (offloading only 10 of 81 layers), but processing an actual ~128k request crashes with `CUDA error: out of memory` on 160 GB (with all layers off…
-
The Matplotlib graph often freezes when manipulated to change zoom, axis tilt, or rotation. This was first noticed when generating 3D models for the tract regions. This affected the utility of the pro…
-
Hello,
I wanted to ask you, @lsalzman, whether ENet could use GSO and sendmmsg (instead of the plain 'sendmsg') to improve throughput?
(see: https://blog.cloudflare…
-
Hey, I noticed that when training with multiple batches I got crashes due to running out of VRAM.
Here is an example patch that offloads batches to the CPU and copies them to the GPU when required [Cpu_of…
-
Theoretically, we can proxy all of the client requests to be sent by the server, meaning we also technically don't need to load the client-side script at all.
This would require writing our own imp…
-
### Usage
Does Prefix Caching currently support offloading to the CPU?
If not, is there a plan to support it? Thanks~
-
In addition to LTO/Graphite, I also build with [Auto-Parallelisation](https://gcc.gnu.org/wiki/AutoParInGCC) where possible. I've converted my own custom flag management hacks over to gentooLTO inclu…