OpenGVLab / LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
https://openlamm.github.io/

How to change options to decrease computation cost during evaluation_3d? #48

Closed Xiaolong-RRL closed 6 months ago

Xiaolong-RRL commented 8 months ago

Dear author:

Thanks for your interesting work.

I wonder how to change the following options to decrease computation cost during evaluation_3d, since inference_3d.py does not expose these args.

--cfg ./config/train_ds3.yaml   # enable Deepspeed ZeRO stage3
--use_flash_attn    # enable flash attention
--use_xformers      # enable xformers

Best! Xiaolong

wangjiongw commented 7 months ago

Sorry for the late reply. We recently refactored our code, and trained MLLMs are now evaluated via ChEF. To decrease computation cost during evaluation of LAMM, we now support LightLLM, which integrates flash attention and other inference optimizations. Therefore, there is no need to enable flash attention or xformers separately during inference.

In fact, flash-attn and xformers currently only work during training, while only LightLLM takes effect during inference.

For your reference, you can add the following line to your evaluation configuration file:

use_lightllm: True
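
For illustration, a minimal evaluation config sketch might look like the following; only use_lightllm is the documented flag here, while the file name and the other key are hypothetical placeholders for your existing config entries:

# config/eval_3d.yaml   (hypothetical file name)
model_name: lamm_3d     # hypothetical key; keep your existing model entry
use_lightllm: True      # enable LightLLM-accelerated inference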

Note that LightLLM mainly targets inference, so during training you should instead add these two lines to your config:

use_flash_attn: True
use_xformers: True
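
As a sketch, in the train_ds3.yaml mentioned above these flags would sit alongside the existing training options; the deepspeed key below is illustrative and stands in for your current ZeRO stage-3 settings:

# config/train_ds3.yaml (existing keys abbreviated)
deepspeed: ...          # existing DeepSpeed ZeRO stage-3 settings
use_flash_attn: True    # enable flash attention during training
use_xformers: True      # enable xformers during training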

These flags are supported for LAMM models, and may be ported to other models accordingly.

wangjiongw commented 6 months ago

This issue will be closed. Please reopen it if you have any further questions.