-
- https://www.ralfj.de/blog/
- https://noidea.dog/glue
- https://xlinux.nist.gov/dads/
- https://blog.sulami.xyz/posts/what-is-in-a-rust-allocator/
- https://quickwit.io/blog/performance-investiga…
-
## Describe This Problem
We found in production that the speed of SST compaction is unable to keep up with the speed of SST generation, leading to poor query performance. However, we are unable to give…
-
### Is your feature request related to a problem? Please describe
We removed the misleading indicator `CPU` on the Model tables. But it would be interesting for the user to have some indication if th…
-
### Describe the bug
Sequential offloading doesn't work when using `pytest`, but does seem to work outside of tests.
This is an issue because we can't properly test sequential offloading on Stabl…
-
**Describe the bug**
The checksum offloading capabilities of an Ethernet interface are not respected for virtual network interfaces as the function `need_calc_checksum()` in "net_if.c" always returns…
-
It seems like pipelining could greatly simplify the implementation of a feature such as fairscale's OffloadModel: https://fairscale.readthedocs.io/en/latest/deep_dive/offload.html
Is this s…
-
Hi, thanks for the great library! I have heard people say that EXL2 is very fast, but I would like to try the 70B llama model on a 24GB 4090 card, so it cannot fit into the GPU using e.g. 4bi…
-
**Problem**
Jan is great, but I'm limited to the number of models I can run on my 16GB GPU. I saw there is a project called [mixtral-offloading](https://github.com/dvmazur/mixtral-offloading) that cou…
-
Now we can train fairly large models with relatively low resources. Thanks to the maintainers!
-
**Describe the bug**
When using a SmoothQuantModifier together with CPU offloading, there is a conflict: tensors end up on the wrong device.
**Expected behavior**
CPU offloading should work w/ SmoothQu…