Closed by jzhang38 2 months ago
Thanks! This is a great feature enabling inference for larger models.
However, could you post a result screenshot or run more tests to check whether the results match?
Hi @jzhang38 🦦
1.5 7B:

```yaml
- model: llava_sglang
  model_args: pretrained=liuhaotian/llava-v1.5-7b
  tasks: mme,ai2d,scienceqa_img
  batch_size: 1
  log_samples: true
  log_samples_suffix: eval_mme
  output_path: "./logs/"
```
| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| mme | Yaml | none | 0 | mme_cognition_score | 352.5000 | ± | N/A |
| | | none | 0 | mme_percetion_score | 1511.3936 | ± | N/A |
| ai2d | Yaml | none | 0 | exact_match | 55.6023 | ± | 0.0089 |
| scienceqa_img | Yaml | none | 0 | exact_match | 69.5092 | ± | 0.0103 |
The results match the reference numbers pretty closely.
1.5 13B:

```yaml
- model: llava_sglang
  model_args: pretrained=liuhaotian/llava-v1.5-13b
  tasks: mme,ai2d,scienceqa_img
  batch_size: 1
  log_samples: true
  log_samples_suffix: eval_mme
  output_path: "./logs/"
```
| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| ai2d | Yaml | none | 0 | exact_match | 59.1645 | ± | 0.0088 |
| mme | Yaml | none | 0 | mme_cognition_score | 295.0000 | ± | N/A |
| | | none | 0 | mme_percetion_score | 1523.5189 | ± | N/A |
| scienceqa_img | Yaml | none | 0 | exact_match | 72.8309 | ± | 0.0099 |
@Luodian
Adds llava_sglang.
Some caveats:
Example eval config and script: