reedest7 opened this issue 1 week ago
2× 48 GB GPUs (this is my device setting) to run the service (`sh reason/llm_service/create_service_math_shepherd.sh`), with the RM and LM models on separate GPUs.
The 1.5B model's inference service alone actually takes up an entire A100 80G (4 lm_workers across 4 GPUs).
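For reference, a minimal sketch of that two-GPU layout, pinning one worker per card via `CUDA_VISIBLE_DEVICES`. The worker modules and flags here are assumptions inferred from the vllm_worker mentioned in the Reproduction below, not the script's exact contents:

```bash
# Hypothetical layout: pin each model to its own card so neither
# worker tries to allocate memory on the other's GPU.
CUDA_VISIBLE_DEVICES=0 python -m fastchat.serve.vllm_worker \
    --model-path Qwen2.5-Math-1.5B-Instruct &      # policy LM on GPU 0
CUDA_VISIBLE_DEVICES=1 python -m fastchat.serve.model_worker \
    --model-path math-shepherd-mistral-7b-prm &    # reward PRM on GPU 1
wait
```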
System Info
A single RTX 3090 with 24 GB of VRAM.
Who can help?
No response
Information
Tasks
Reproduction
POLICY_MODEL=Qwen2.5-Math-1.5B-Instruct; VALUE_MODEL_NAME=math-shepherd-mistral-7b-prm. When running `sh reason/llm_service/create_service_math_shepherd.sh`, only the vllm_worker starts; the reward model cannot be started, and an OOM error is reported.
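One likely cause, assuming the script launches the policy model through vLLM (as the vllm_worker above suggests): vLLM pre-allocates roughly 90% of the card for its KV cache by default (`gpu_memory_utilization=0.9`), leaving almost nothing for the 7B reward model to load into. A hedged sketch of the usual workaround, capping the LM's share with vLLM's `--gpu-memory-utilization` flag (the exact launch line inside the script may differ):

```bash
# Cap the vLLM worker's pre-allocation so the reward model can fit
# on the same 24 GB card; 0.3 is a starting guess to tune, not measured.
CUDA_VISIBLE_DEVICES=0 python -m fastchat.serve.vllm_worker \
    --model-path Qwen2.5-Math-1.5B-Instruct \
    --gpu-memory-utilization 0.3
```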
Expected behavior
How much VRAM is required to run both models on a single card?
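For rough scale (back-of-envelope, fp16 weights only): the 7B PRM needs about 7 × 2 ≈ 14 GB and the 1.5B policy about 3 GB, so roughly 17 GB of the 3090's 24 GB would go to weights alone, before any KV cache or activations. That is consistent with the second model failing to load under the default settings.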