PKU-YuanGroup / MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Paper: https://arxiv.org/abs/2401.15947
Apache License 2.0 · 1.91k stars · 121 forks
Issues (newest first)
#92 · [Question] Train and eval scripts for the non-MoE LLaVA-Phi in Table 7 of the paper · sharkdrop · opened 12 hours ago · 0 comments
#91 · [Question] Is there any MoE checkpoint of Qwen1.5 or Qwen2 released? · double-fire-0 · opened 2 weeks ago · 0 comments
#90 · [Question] How to eval textqa · fanminshi · opened 3 weeks ago · 1 comment
#89 · [Question] Step 3 loss curve · fanminshi · opened 1 month ago · 0 comments
#88 · [Question] Question about the tokenizer of the required pretrained model stabilityai/stablelm-2-1_6 · Taylorfire · opened 1 month ago · 1 comment
#87 · [Question] In paper Table 6, why is variant (d) better than variant (c)? · pkumc · opened 1 month ago · 0 comments
#86 · [Feature request] Will further models be trained? · gesen2egee · opened 2 months ago · 0 comments
#85 · Training of Stage 3: the actual training parameters in the code do not match the paper · Wuyingwen · opened 2 months ago · 1 comment
#84 · [Question] What exactly does the language model mean? · dana-niu · opened 2 months ago · 0 comments
#83 · [Discussion] What is the expert relationship between different layers with the same index? If there is none, what is the role of Figures 4, 5, and 6 in the paper? · meteorlium · opened 3 months ago · 0 comments
#82 · [Question] ValueError: Unknown image tower: /hy-tmp/LLaVA/clip-vit-large-patch14-336 · FanshuoZeng · opened 3 months ago · 5 comments
#81 · Can the confidence coefficient of an answer be obtained? · IsabelJimenez99 · opened 3 months ago · 0 comments
#80 · [Question] Inconsistency in MoE layer number between paper and model config · QAQdev · opened 4 months ago · 0 comments
#79 · [Usage] Add Windows support for more exposure · mr-lab · opened 4 months ago · 0 comments
#78 · Can you provide a Python script for using the API with your demo? · gamesubzero · opened 4 months ago · 0 comments
#77 · MoE fine-tuning error · sahilqure · opened 4 months ago · 0 comments
#76 · [Question] Multi-image collate_fn · PangziZhang523 · opened 4 months ago · 0 comments
#75 · Minor fix and tips update for README · QAQdev · closed 1 month ago · 0 comments
#74 · [Question] Can you explain the class LlavaQWenMetaForCausalLM(LlavaMetaForCausalLM) in llava_arch? · 20191864218 · opened 4 months ago · 0 comments
#73 · [Question] Pretrain step · rlagustmd82 · closed 4 months ago · 0 comments
#72 · [Question] CUDA OOM when fine-tuning phi2-clipL336 at Stage 2 with 8x A100-40G · terry-for-github · closed 4 months ago · 1 comment
#71 · [Feature request] Support Llama3 · xiweideng · opened 4 months ago · 0 comments
#70 · [Question] About parameter ep_size · puppy2000 · opened 5 months ago · 0 comments
#69 · [Usage] tokenizer.pad_token_id == None? · sjtu-cz · opened 5 months ago · 1 comment
#68 · [Question] Error when running cli.py for inference with Qwen-7B-base as the LLM · 20191864218 · closed 4 months ago · 1 comment
#67 · [Question] Discussion of the parameters in the paper · bufanx · opened 5 months ago · 1 comment
#66 · [Question] About the Stage 3 training loss · rangmiao · opened 6 months ago · 0 comments
#65 · [Usage] DeepSpeed MoE hangs when EP_SIZE > 1 · Wadaxiwan · closed 5 months ago · 1 comment
#64 · DeepSpeed MoE issue · BlackBearBiscuit · opened 6 months ago · 0 comments
#63 · RuntimeError: mat1 and mat2 must have the same dtype · Crystalxd · opened 6 months ago · 0 comments
#62 · How to fine-tune MoE-LLaVA with your own dataset · Tunanzzz · opened 6 months ago · 4 comments
#61 · [Question] Can't find "mm_projecotr.bin" in the model_path · sdlyzhq · closed 6 months ago · 0 comments
#60 · [Question] The evaluation results vary every time. · koda-11 · opened 6 months ago · 0 comments
#59 · [Question] The evaluation results vary every time. · koda-11 · closed 6 months ago · 0 comments
#58 · [Question] Adding to the dataset. · arthurwolf · opened 6 months ago · 0 comments
#57 · [Question] Will the Stage 2 fine-tuned model be open-sourced? · murray-z · opened 6 months ago · 1 comment
#56 · [Question] How to further fine-tune the MoE model on your own data? · murray-z · opened 6 months ago · 3 comments
#55 · [Question] How to visualize the routing distribution? · koda-11 · closed 6 months ago · 1 comment
#54 · [Question] Model and dataset size · adrielkuek · opened 6 months ago · 0 comments
#53 · [Question] How did you use 768x768 resolution? · lucasjinreal · opened 6 months ago · 0 comments
#52 · [Question] How to fine-tune the MoE-LLaVA model on custom data? · RayshenSL · opened 6 months ago · 0 comments
#51 · [Usage] ValueError: Unknown image tower: /data1/ljq/Moellava/MoE-LLaVA-Qwen-1.8B-4e/clip-vit-large-patch14-336 · xiangchihuoguo · opened 6 months ago · 3 comments
#50 · [Question] Scale down further to support IoT use cases? · kinchahoy · opened 6 months ago · 1 comment
#49 · [Question] About nlp_tune data · Lucky-Lance · opened 6 months ago · 2 comments
#48 · Error during training on custom dataset · saeedkhaki92 · opened 6 months ago · 1 comment
#47 · Question about the inference-efficiency comparison · aprilehannibal · opened 6 months ago · 1 comment
#46 · [Discussion] How to improve the model's understanding of high-resolution images? · whalefa1I · opened 6 months ago · 1 comment
#45 · [Question] How to check the activated parameters of MoE models? · koda-11 · closed 6 months ago · 2 comments
#44 · "Hi, everyone. Sorry for that, we updated the new running command to fix it. Check here: https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/main/scripts/v1/qwen/finetune_moe.sh" · hxhcreate · closed 6 months ago · 2 comments
#43 · [Question] Image patch representation in this work · cydiachen · closed 6 months ago · 1 comment