dvlab-research / LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Apache License 2.0

GPU memory usage keeps increasing during inference #57

Closed kunkunsheng closed 8 months ago

kunkunsheng commented 8 months ago

When using cli.py for inference, GPU memory usage keeps increasing as the conversation continues. Is there any way to prevent the GPU memory from growing?

yanwei-li commented 8 months ago

Hi, I guess this issue is caused by concatenating multi-turn conversations in each forward pass. If your conversation has several turns, you can reduce the conversation history stored in conv at this line.
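A minimal sketch of what trimming that history could look like, assuming a LLaVA-style Conversation object whose messages attribute is a list of (role, text) pairs; the exact names in LLaMA-VID's cli.py may differ:

```python
# Hypothetical helper: keep only the most recent exchanges in the conversation
# object before building the next prompt, so the concatenated prompt (and the
# KV cache it produces) stops growing with every turn.
MAX_TURNS = 1  # number of past user/assistant exchanges to keep


def trim_history(conv, max_turns=MAX_TURNS):
    # conv.messages is assumed to hold alternating user/assistant entries;
    # adjust the indexing to the actual structure used in cli.py.
    keep = max_turns * 2  # one user message + one assistant reply per turn
    if len(conv.messages) > keep:
        conv.messages = conv.messages[-keep:]
    return conv
```

Calling something like trim_history(conv) right before the prompt is assembled each turn would cap the prompt length; how much history to keep depends on how much multi-turn context you actually need.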

kunkunsheng commented 8 months ago

I guess this issue is caused by concatenating multi-turn conversations in each forward pass.

How should I change the code so that each turn is treated as the model's first conversation?

kunkunsheng commented 8 months ago

Hi, I guess this issue is caused by concatenating multi-turn conversations in each forward pass. If your conversation has several turns, you can reduce the conversation history stored in conv at this line.

Thank you, it has been solved. Clearing the GPU memory cache after each turn is enough.
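For reference, "clearing the cache" here corresponds to releasing PyTorch's cached GPU memory after each turn. A minimal sketch using standard PyTorch calls (where to place it inside cli.py's loop is up to you):

```python
import gc

import torch


def free_gpu_cache():
    # Drop Python references that are no longer reachable, then release
    # GPU memory that PyTorch has cached but is not currently using.
    gc.collect()
    torch.cuda.empty_cache()
```

Note that torch.cuda.empty_cache() only frees memory PyTorch has cached but is not actively using; if the conversation history keeps growing, the KV cache will still grow, so combining this with trimming conv is the more robust fix.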