-
Thank you for the easy-to-follow code, but I have some questions about the differences between "Teacher Forcing" and "Diffusion Forcing" at inference (denoising) time.
After I investigated and prin…
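For readers landing on this thread, here is a minimal sketch of the distinction being asked about, assuming the standard Diffusion Forcing formulation (an independent noise level per frame). The tensor names are hypothetical and this is not the repo's actual sampling code:

```python
import torch

T, F = 1000, 16                          # diffusion steps, frames per clip
frames = torch.randn(1, F, 3, 64, 64)    # a noisy latent video

# Teacher-forcing-style sampling: every frame shares one noise level, so the
# whole clip is denoised in lockstep from t = T-1 down to t = 0.
t_shared = torch.full((1, F), T - 1)

# Diffusion Forcing: each frame carries its own noise level, so earlier
# frames can be almost clean while later frames are still mostly noise.
t_per_frame = torch.linspace(0, T - 1, F).long().unsqueeze(0)

# Either schedule would then be handed to the denoiser at each step, e.g.
# eps = model(frames, t_shared)  vs.  eps = model(frames, t_per_frame)
```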
-
Hi author,
It seems like SAMPLING_FRAME_NUM determines the length of the synthetic video that is fed to the model.
What should this value be during training if I want to evaluate the model on 100-frame videos?
Do you…
-
What is the Qwen2-VL Max HF Demo config?
https://huggingface.co/spaces/Qwen/Qwen2-VL
In the demo from this repo, I found the setup for 7B, but is Qwen2-VL-Max the same?
Could someone please prov…
-
Thank you for your interesting work!
For the Qwen2-VL-72B model, did you use the whole video or just sample 48 frames per video? If you used the whole video, what was the fps?
-
I ran the code below on 8×80 GB A100s and hit an OOM error:
accelerate launch --num_processes 8 --main_process_port 12345 -m lmms_eval \
--model qwen2_vl \
--model_args pretrained="/share/huaying/pretrained_model/Qwe…
-
## 🚀 Feature Request
I'm trying to add support for sub-sampling long videos in StreamingDataset. I have two possible implementation methods in mind, and I'd like your feedback on which is most inli…
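For reference, here is a minimal sketch of one common sub-sampling policy (uniform stride). `subsample_indices` is a hypothetical helper and does not correspond to StreamingDataset's actual API:

```python
import numpy as np

def subsample_indices(num_frames: int, target: int) -> np.ndarray:
    """Pick `target` frame indices spread uniformly across a long video."""
    if num_frames <= target:
        return np.arange(num_frames)
    # Take the center of each of `target` equal-width bins over [0, num_frames).
    return ((np.arange(target) + 0.5) * num_frames / target).astype(int)

# A 10,000-frame video reduced to 48 frames:
print(subsample_indices(10_000, 48))
```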
-
Hi,
I have a series of images sampled at a low rate (`2 frames/second`) from a video, and I want the model to help recognize a behavior from these image frames; the behavior is a time se…
-
Currently there is absolutely no error handling. We need to add some. If the websocket server crashes, it takes down the video_capture interface as well. I think some trapping around the ws.send() cal…
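One possible shape for that trapping, as a sketch assuming a Python websocket-client style API (`ws.send`, `websocket.create_connection`); the project's actual client and reconnect policy may differ:

```python
import time
import websocket  # pip install websocket-client

def safe_send(ws, ws_url, payload, retries=3):
    """Send on the websocket; on failure, back off and reconnect instead of
    letting the exception take down the video capture loop."""
    for attempt in range(retries):
        try:
            ws.send(payload)
            return ws
        except (websocket.WebSocketException, ConnectionError, OSError):
            time.sleep(2 ** attempt)          # simple exponential backoff
            try:
                ws = websocket.create_connection(ws_url)
            except (websocket.WebSocketException, OSError):
                pass                          # server still down; retry
    raise RuntimeError("websocket unreachable after retries")
```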
-
Here is a minimal repro case.
We'll start with something that works so we can keep it as a reference. The basic workflow below generates a video of a running person (SparseCtrl-RGB module is bypas…