Closed: sjghh closed this issue 1 week ago
Hello, I'm a PhD student at ZJU, and I also use VideoLLaMA2 for my own research. We have created a WeChat group to discuss VideoLLaMA2 issues. Would you like to join us? Please contact me: WeChat ID LiangMeng19357260600, phone number +86 19357260600, e-mail liangmeng89@zju.edu.cn.
Hi, you can follow the settings of the code below to name the selected data file.
Thank you again for your response, but I still have three questions:
If there is only one `video_path` with audio, must this data be labeled as stage3 to proceed with joint training?

If there is only one `video_path` with audio, this data does not have to be marked as stage3 for joint training. For videos with audio, the model will automatically extract the audio to optimize the audio/video projector and audio encoder. The extracted audio will go through the audio branch, and the video will go through the video branch.
Thank you again for your response. I have another question: If I have only one .json file, can it include both image and video formats, or do I need to modify the JSON for image to follow the stage2 format?
@xinyifei99 If I find that I can only process video files when there is a single `.json` file, how can I train both videos and images simultaneously? Thank you for taking the time to answer!
I noticed that the AV version of the inference script does not include examples for image inference. Does this mean it cannot perform image inference?
Hello, I encountered an issue while fine-tuning using `/scripts/custom/va_joint.sh`. I used two files, `video.json` and `audio.json`, like this: `--data_path ${DATA_DIR}/stage3_video_audio.json,${DATA_DIR}/stage2_audio_subset_new.json`. The problem occurred, but when I used only `video.json`, the issue did not appear. I suspect that it's necessary to use three `.json` files, for example: `--data_path ${DATA_DIR}/stage3_video_audio.json,${DATA_DIR}/stage2_audio_subset_new.json,${DATA_DIR}/stage2_video_subset.json`.

    Traceback (most recent call last):
      File "/data/hongbo.xu/Datasets/MC-ERU/Video-llama2/VideoLLaMA2-audio_visual/videollama2/train.py", line 683, in <module>
        train()
      File "/data/hongbo.xu/Datasets/MC-ERU/Video-llama2/VideoLLaMA2-audio_visual/videollama2/train.py", line 660, in train
        data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args)
      File "/data/hongbo.xu/Datasets/MC-ERU/Video-llama2/VideoLLaMA2-audio_visual/videollama2/train.py", line 433, in make_supervised_data_module
        train_dataset = LazySupervisedDataset(
      File "/data/hongbo.xu/Datasets/MC-ERU/Video-llama2/VideoLLaMA2-audio_visual/videollama2/train.py", line 274, in __init__
        raise NotImplementedError
    NotImplementedError

Thank you for your help amidst your busy schedule!
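For anyone debugging a similar `NotImplementedError`, it may help to inspect each file in the comma-separated `--data_path` value before launching training. The following is only a minimal sketch: it does not reproduce the logic in `train.py`, the function names `split_data_paths` and `summarize_annotation_file` are hypothetical, and it assumes each annotation file is a JSON list of dictionary records.

```python
import json
from pathlib import Path


def split_data_paths(data_path: str) -> list[str]:
    """Split a comma-separated --data_path value into individual file paths."""
    return [p.strip() for p in data_path.split(",") if p.strip()]


def summarize_annotation_file(path: str) -> dict:
    """Report whether a file exists, how many records it holds, and which keys they use.

    Assumption: the file is a JSON list of dict records. Anything else is
    reported as zero records rather than raising.
    """
    file = Path(path)
    if not file.is_file():
        return {"path": path, "exists": False, "records": 0, "keys": []}
    with file.open(encoding="utf-8") as f:
        data = json.load(f)
    records = data if isinstance(data, list) else []
    # Collect the union of top-level keys across all records.
    keys = sorted({k for r in records if isinstance(r, dict) for k in r})
    return {"path": path, "exists": True, "records": len(records), "keys": keys}
```

Calling `summarize_annotation_file` on each path returned by `split_data_paths` shows which files are missing or empty and which top-level keys each file uses, which can then be compared against what line 274 of `train.py` checks before it raises.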