rese1f / MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
https://rese1f.github.io/MovieChat/
BSD 3-Clause "New" or "Revised" License

How to Properly Reset Long Term Memory for Multiple Video Inferences Without Reloading the Model? #58

Open JUNJIE99 opened 7 months ago

JUNJIE99 commented 7 months ago

Hi, thank you for your great work!

I am using the inference.py script to process multiple videos sequentially. Currently, I have to reload the model to reset the long-term memory for each new video, which is quite resource-intensive.

Is there a more efficient way to reinitialize the long-term memory without reloading the model each time? Any advice would be highly appreciated.

Thank you for your attention and I look forward to your suggestions.

JUNJIE99 commented 7 months ago

More specifically, I've noticed that the following block of code, which initializes the model and its components, needs to be executed for each new video:

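# Assumes cfg, args, registry, and Chat are already defined, as in inference.py.
device = 'cuda:{}'.format(args.gpu_id)

# Build the model from the config and move it to the selected GPU (the expensive step).
model_config = cfg.model_cfg
model_config.device_8bit = args.gpu_id
model_cls = registry.get_model_class(model_config.arch)
model = model_cls.from_config(model_config).to(device)

# Build the visual processor and wrap both in a Chat session.
vis_processor_cfg = cfg.datasets_cfg.webvid.vis_processor.train
vis_processor = registry.get_processor_class(vis_processor_cfg.name).from_config(vis_processor_cfg)
chat = Chat(model, vis_processor, device=device)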

This approach requires reloading the entire model. I am looking for a way to manually reset the model's memory state for a new video without having to rerun the entire initialization block.
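To make the request concrete, here is a minimal sketch of the kind of reset I have in mind. It assumes the memory state lives in plain Python list attributes on the model object. The attribute names (`short_memory_buffer`, `long_memory_buffer`, `temp_short_memory`) are my assumption and should be checked against the model class. `reset_memory` and `FakeModel` are hypothetical helpers written only for this illustration and are not part of the repository.

```python
from typing import Any, Iterable, List

# Assumed attribute names; verify them against the actual model class before use.
ASSUMED_MEMORY_ATTRS = (
    "short_memory_buffer",
    "long_memory_buffer",
    "temp_short_memory",
)


def reset_memory(model: Any, attrs: Iterable[str] = ASSUMED_MEMORY_ATTRS) -> List[str]:
    """Replace each named list attribute on `model` with a fresh empty list.

    Attributes that are missing or are not lists are left untouched.
    Returns the names of the attributes that were actually reset.
    """
    cleared = []
    for name in attrs:
        if isinstance(getattr(model, name, None), list):
            setattr(model, name, [])
            cleared.append(name)
    return cleared


class FakeModel:
    """Hypothetical stand-in for the real model, used only to show the idea."""

    def __init__(self) -> None:
        self.short_memory_buffer = [1, 2]   # pretend per-video frame features
        self.long_memory_buffer = [3]       # pretend consolidated memory
        self.weights = "loaded once"        # stays untouched by the reset


if __name__ == "__main__":
    model = FakeModel()
    print(reset_memory(model))  # names of the buffers that were cleared
```

With something like this, the initialization block would run once and `reset_memory(model)` would be called before each new video. Whether the `Chat` object or the model keeps additional per-video state elsewhere is something I have not been able to confirm.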

Thank you for your attention!

yzy-bupt commented 3 months ago

I am running into the same problem.

Espere-1119-Song commented 3 months ago

Do you mean changing the length of the short-term memory or the long-term memory?