dvmazur / mixtral-offloading
Run Mixtral-8x7B models in Colab or on consumer desktops
MIT License · 2.29k stars · 227 forks
Issues
#39 Hard to benchmark the operation in the repo (mynotwo, opened 3 months ago, 1 comment)
#38 Mixtral Instruct tokenizer from Colab notebook doesn't work (jmuntaner-smd, opened 4 months ago, 2 comments)
#37 Change of query weight matrices shapes (avani17101, closed 6 months ago, 0 comments)
#36 Support DeepSeek V2 model (Minami-su, opened 6 months ago, 0 comments)
#35 Having issue loading my HQQ quantized model (BeichenHuang, opened 7 months ago, 0 comments)
#34 How to split the model parameter safetensors file into multiple small files (YLSnowy, opened 7 months ago, 5 comments)
#33 Implementation of benchmarks (C4 perplexity, Wikitext perplexity) (ChengSashankh, opened 7 months ago, 0 comments)
#32 RuntimeError when nbit = 4 and group_size = 64 (Eutenacity, opened 7 months ago, 0 comments)
#31 Triton issues in running the code locally (amangupt01, closed 7 months ago, 1 comment)
#30 Can this be used for Jamba inference? (freQuensy23-coder, opened 7 months ago, 1 comment)
#29 FastAPI integration and performance benchmarking (Jnmz, opened 7 months ago, 0 comments)
#28 (Colab) Clear GPU RAM usage after running the generation code without restarting the instance (ScottLinnn, closed 7 months ago, 1 comment)
#27 Update build_model.py (fire717, opened 8 months ago, 0 comments)
#26 A strange issue with default parameters: "RuntimeError about memory" (a1564740774, opened 8 months ago, 0 comments)
#25 Update requirements.txt (Soumadip-Saha, opened 9 months ago, 0 comments)
#24 Run on a second GPU (torch.device("cuda:1")) (imabot2, opened 9 months ago, 1 comment)
#23 4-bit/3-bit model produces gibberish when plugged into the demo (jjblum, opened 10 months ago, 0 comments)
#22 Run without quantization (freQuensy23-coder, opened 10 months ago, 9 comments)
#21 hqq_aten package not installed (LeMoussel, closed 10 months ago, 2 comments)
#20 Fix typo in README.md (kaushalpowar, opened 10 months ago, 0 comments)
#19 Need Mixtral offloading for NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO (githubpradeep, opened 10 months ago, 0 comments)
#18 CUDA OOM errors in WSL2 (MrNova111, opened 10 months ago, 0 comments)
#17 Is it possible to finetune this on a custom dataset? (asmith26, opened 10 months ago, 8 comments)
#16 Can it run with LlamaIndex? (LeMoussel, opened 10 months ago, 0 comments)
#15 Can it run on multiple GPUs? (drdh, opened 11 months ago, 10 comments)
#14 Doesn't work (SanskarX10, closed 11 months ago, 11 comments)
#13 How to use the offloading in my MoE model? (WangRongsheng, closed 11 months ago, 4 comments)
#12 CLI interface added (NJannasch, opened 11 months ago, 3 comments)
#11 Mixtral Offloading/GGUF/ExLlamaV2: which approach to use? (LeMoussel, opened 11 months ago, 1 comment)
#10 Enhancing the Efficacy of MoE Offloading with Speculative Prefetching Strategies (yihong1120, opened 11 months ago, 1 comment)
#9 Use pop for meta-keys cleanup (vivekmaru36, closed 10 months ago, 1 comment)
#8 Update README.md (eltociear, closed 11 months ago, 1 comment)
#7 Session crashed on Colab (bitsnaps, closed 11 months ago, 4 comments)
#6 Revert "Some refactoring" (lavawolfiee, closed 11 months ago, 0 comments)
#5 Some refactoring (lavawolfiee, closed 11 months ago, 0 comments)
#4 exl2 (eramax, opened 11 months ago, 2 comments)
#3 Refactor (dvmazur, closed 11 months ago, 0 comments)
#2 Adding requirements.txt (h9-tect, opened 11 months ago, 1 comment)
#1 Fix Colab (dvmazur, closed 11 months ago, 0 comments)