dvmazur / mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
MIT License · 2.27k stars · 224 forks
Issues
#38 · Mixtral Instruct tokenizer from Colab notebook doesn't work. · jmuntaner-smd · opened 1 day ago · 0 comments
#37 · Change of query weight matrices shapes · avani17101 · closed 1 month ago · 0 comments
#36 · Support DeepSeek V2 model · Minami-su · opened 1 month ago · 0 comments
#35 · Having issue loading my HQQ quantized model · BeichenHuang · opened 2 months ago · 0 comments
#34 · How to split the model parameter safetensors file into multiple small files · YLSnowy · opened 2 months ago · 5 comments
#33 · Implementation of benchmarks (C4 perplexity, Wikitext perplexity) · ChengSashankh · opened 2 months ago · 0 comments
#32 · RuntimeError when nbit = 4 and group_size = 64 · Eutenacity · opened 2 months ago · 0 comments
#31 · Triton issues in running the code locally · amangupt01 · closed 2 months ago · 1 comment
#30 · Can this be used for Jambo inference? · freQuensy23-coder · opened 3 months ago · 1 comment
#29 · FastAPI Integration and Performance Benchmarking · Jnmz · opened 3 months ago · 0 comments
#28 · (Colab) Clear GPU RAM usage after running the generation code without restarting instance · ScottLinnn · closed 2 months ago · 1 comment
#27 · Update build_model.py · fire717 · opened 3 months ago · 0 comments
#26 · A strange issue with default parameters: "RuntimeError about memory" · a1564740774 · opened 3 months ago · 0 comments
#25 · Update Requirements.txt · Soumadip-Saha · opened 4 months ago · 0 comments
#24 · Run on second GPU (torch.device("cuda:1")) · imabot2 · opened 5 months ago · 1 comment
#23 · 4bit-3bit model produces gibberish when plugged into demo · jjblum · opened 5 months ago · 0 comments
#22 · Run without quantization · freQuensy23-coder · opened 5 months ago · 9 comments
#21 · hqq_aten package not installed. · LeMoussel · closed 5 months ago · 1 comment
#20 · Update typo in README.md · kaushalpowar · opened 5 months ago · 0 comments
#19 · need mixtral offload for NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO · githubpradeep · opened 5 months ago · 0 comments
#18 · CUDA OOM errors in wsl2 · MrNova111 · opened 5 months ago · 0 comments
#17 · Is it possible to finetune this on a custom dataset? · asmith26 · opened 6 months ago · 8 comments
#16 · Can it run with LlamaIndex? · LeMoussel · opened 6 months ago · 0 comments
#15 · Can it run on multi-GPU? · drdh · opened 6 months ago · 10 comments
#14 · Doesn't work · SanskarX10 · closed 6 months ago · 11 comments
#13 · How to use the offloading in my MoE model? · WangRongsheng · closed 6 months ago · 4 comments
#12 · CLI interface added · NJannasch · opened 6 months ago · 3 comments
#11 · Mixtral OffLoading/GGUF/ExLlamaV2, which approach to use? · LeMoussel · opened 6 months ago · 1 comment
#10 · Enhancing the Efficacy of MoE Offloading with Speculative Prefetching Strategies · yihong1120 · opened 6 months ago · 1 comment
#9 · Utilized pop for meta keys cleanup · vivekmaru36 · closed 6 months ago · 1 comment
#8 · Update README.md · eltociear · closed 6 months ago · 1 comment
#7 · Session crashed on colab · bitsnaps · closed 6 months ago · 4 comments
#6 · Revert "Some refactoring" · lavawolfiee · closed 6 months ago · 0 comments
#5 · Some refactoring · lavawolfiee · closed 6 months ago · 0 comments
#4 · exl2 · eramax · opened 6 months ago · 2 comments
#3 · Refactor · dvmazur · closed 6 months ago · 0 comments
#2 · adding requirements.txt · h9-tect · opened 6 months ago · 1 comment
#1 · Fix colab · dvmazur · closed 6 months ago · 0 comments