PKU-YuanGroup / MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Paper: https://arxiv.org/abs/2401.15947
Apache License 2.0 · 1.91k stars · 121 forks
Issues (newest first)
#92 · [Question] Train and eval scripts for the non-MoE LLaVA-Phi in Table 7 of the paper · sharkdrop · opened 12 hours ago · 0 comments
#91 · [Question] Is there any MoE checkpoint of Qwen1.5 or Qwen2 released? · double-fire-0 · opened 2 weeks ago · 0 comments
#90 · [Question] How to eval textqa · fanminshi · opened 3 weeks ago · 1 comment
#89 · [Question] Step 3 loss curve · fanminshi · opened 1 month ago · 0 comments
#88 · [Question] Question about the tokenizer of the required pretrained model stabilityai/stablelm-2-1_6 · Taylorfire · opened 1 month ago · 1 comment
#87 · [Question] In paper Table 6, why is variant (d) better than variant (c)? · pkumc · opened 1 month ago · 0 comments
#86 · [Feature request] Will further models be trained? · gesen2egee · opened 2 months ago · 0 comments
#85 · Training of Stage 3: the actual training parameters in the code do not match the paper · Wuyingwen · opened 2 months ago · 1 comment
#84 · [Question] What exactly does the language model mean? · dana-niu · opened 2 months ago · 0 comments
#83 · [Discussion] What is the expert relationship between different layers with the same index? If there is none, what is the role of Figures 4, 5, and 6 in the paper? · meteorlium · opened 3 months ago · 0 comments
#82 · [Question] ValueError: Unknown image tower: /hy-tmp/LLaVA/clip-vit-large-patch14-336 · FanshuoZeng · opened 3 months ago · 5 comments
#81 · Can the confidence coefficient of an answer be obtained? · IsabelJimenez99 · opened 3 months ago · 0 comments
#80 · [Question] Inconsistency in MoE layer number between paper and model config · QAQdev · opened 4 months ago · 0 comments
#79 · [Usage] Add Windows support for more exposure · mr-lab · opened 4 months ago · 0 comments
#78 · Can you provide a Python script for using the API with your demo? · gamesubzero · opened 4 months ago · 0 comments
#77 · MoE fine-tuning error · sahilqure · opened 4 months ago · 0 comments
#76 · [Question] Multi-image collate_fn · PangziZhang523 · opened 4 months ago · 0 comments
#75 · Minor fix and tips update for README · QAQdev · closed 1 month ago · 0 comments
#74 · [Question] Can you explain the class LlavaQWenMetaForCausalLM(LlavaMetaForCausalLM) in llava_arch? · 20191864218 · opened 4 months ago · 0 comments
#73 · [Question] Pretrain step · rlagustmd82 · closed 4 months ago · 0 comments
#72 · [Question] CUDA OOM when fine-tuning phi2-clipL336 at Stage 2 with 8x A100-40G · terry-for-github · closed 4 months ago · 1 comment
#71 · [Feature request] Support Llama3 · xiweideng · opened 4 months ago · 0 comments
#70 · [Question] About parameter ep_size · puppy2000 · opened 5 months ago · 0 comments
#69 · [Usage] tokenizer.pad_token_id == None? · sjtu-cz · opened 5 months ago · 1 comment
#68 · [Question] Error when running cli.py for inference with Qwen-7B-base as the LLM · 20191864218 · closed 4 months ago · 1 comment
#67 · [Question] Discussion of the parameters in the paper · bufanx · opened 5 months ago · 1 comment
#66 · [Question] About the Stage 3 training loss · rangmiao · opened 6 months ago · 0 comments
#65 · [Usage] DeepSpeed MoE hangs when EP_SIZE > 1 · Wadaxiwan · closed 5 months ago · 1 comment
#64 · DeepSpeed MoE issue · BlackBearBiscuit · opened 6 months ago · 0 comments
#63 · RuntimeError: mat1 and mat2 must have the same dtype · Crystalxd · opened 6 months ago · 0 comments
#62 · How to fine-tune MoE-LLaVA with your own dataset · Tunanzzz · opened 6 months ago · 4 comments
#61 · [Question] Can't find "mm_projecotr.bin" in the model_path · sdlyzhq · closed 6 months ago · 0 comments
#60 · [Question] The evaluation results vary every time. · koda-11 · opened 6 months ago · 0 comments
#59 · [Question] The evaluation results vary every time. · koda-11 · closed 6 months ago · 0 comments
#58 · [Question] Adding to the dataset. · arthurwolf · opened 6 months ago · 0 comments
#57 · [Question] Will the Stage 2 fine-tuned model be open-sourced? · murray-z · opened 6 months ago · 1 comment
#56 · [Question] How to further fine-tune the MoE model on your own data? · murray-z · opened 6 months ago · 3 comments
#55 · [Question] How to visualize the routing distribution? · koda-11 · closed 6 months ago · 1 comment
#54 · [Question] Model and dataset size · adrielkuek · opened 6 months ago · 0 comments
#53 · [Question] How did you use 768x768 resolution? · lucasjinreal · opened 6 months ago · 0 comments
#52 · [Question] How to fine-tune the MoE-LLaVA model on custom data? · RayshenSL · opened 6 months ago · 0 comments
#51 · [Usage] ValueError: Unknown image tower: /data1/ljq/Moellava/MoE-LLaVA-Qwen-1.8B-4e/clip-vit-large-patch14-336 · xiangchihuoguo · opened 6 months ago · 3 comments
#50 · [Question] Scale down further to support IoT use cases? · kinchahoy · opened 6 months ago · 1 comment
#49 · [Question] About nlp_tune data · Lucky-Lance · opened 6 months ago · 2 comments
#48 · Error during training on custom dataset · saeedkhaki92 · opened 6 months ago · 1 comment
#47 · Question about the inference-efficiency comparison · aprilehannibal · opened 6 months ago · 1 comment
#46 · [Discussion] How to improve the model's understanding of high-resolution images? · whalefa1I · opened 6 months ago · 1 comment
#45 · [Question] How to check the activated parameters of MoE models? · koda-11 · closed 6 months ago · 2 comments
#44 · "Hi, everyone. Sorry for that, we updated the new running command to fix it. Check here: https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/main/scripts/v1/qwen/finetune_moe.sh" · hxhcreate · closed 6 months ago · 2 comments
#43 · [Question] Image patch representation in this work · cydiachen · closed 6 months ago · 1 comment