-
Hi, thank you so much for the great work! I have a question about the number of sampled frames. The paper mentions:
> During inference, we uniformly sample 6 frames with center crop.
I am keen to kn…
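
For readers unfamiliar with the quoted phrase, here is a minimal sketch of what "uniformly sample 6 frames with center crop" could look like. It is not taken from the repository under discussion: the function names `uniform_frame_indices` and `center_crop` are hypothetical, the choice of segment midpoints is one common convention among several, and frames are represented as plain nested lists to keep the example dependency-free.

```python
def uniform_frame_indices(num_frames, num_samples=6):
    """Pick num_samples frame indices spread evenly over a clip.

    Assumption: the clip is split into num_samples equal segments and
    the midpoint of each segment is taken. Other conventions exist.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    segment = num_frames / num_samples
    # Clamp to the last valid index in case of rounding at the end.
    return [min(int((i + 0.5) * segment), num_frames - 1)
            for i in range(num_samples)]


def center_crop(frame, size):
    """Crop a size x size window from the middle of a frame.

    Assumption: frame is a list of rows (each row a list of pixels)
    and size does not exceed the frame height or width.
    """
    height, width = len(frame), len(frame[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in frame[top:top + size]]


# Example: a 60-frame clip sampled at 6 evenly spaced positions.
indices = uniform_frame_indices(60, 6)
```

With a short clip (fewer frames than samples), this sketch repeats indices rather than failing, which is one possible design choice.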
-
Hello, I want to use demo_vid2seq.py for video captioning, but there are several things I don't understand. First, when I run demo_vid2seq.py, I get an error:
load Vid2Seq model
Traceba…
tickm updated
11 months ago
-
Hi @xingyizhou, @a-nagrani and @antoyang,
I'm writing to you because I'm interested in using the Vid2Seq model for dense captioning and video captioning on a few educational videos which are MP4 fil…
-
# TODO
* [ ] Close this issue when the **great fork merge** happens.
# Original comment
Like the thread in the [other repo](https://github.com/ioccc-src/mkiocccentry/issues/171) this is to he…
xexyl updated
10 hours ago
-
### Feature request
Implement a feature using Langchain's image_captions.py and audio_speech_to_text.py to produce .srt files. This system will provide both subtitles and visual scene descriptions, e…
-
Hello, thank you for sharing your code. Can you help me in this scenario:
For a video captioning model, I have sampled each video with 16 frames. I've employed a Video Swin Transformer to extract vid…
-
**System information**
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): "18.04.1 LTS…
-
Hi!
May I know how to run inference without speech?
I've set --no_speech, but the output is [].
And when I run inference on the ActivityNet and Charades datasets, the output looks like it…
-
I was trying to load a workflow discussed [here](https://www.youtube.com/watch?v=qW1I7in1WL0&t=236s) after fiddling with a few others when I started getting a pop-up error that comes right back if clicke…
CCpt5 updated
9 months ago
-
### Feature request
I'd like to request the ability to stream back chunks of audio transcripts instead of having to wait for the entire audio to be processed. For real-time use cases, it helps to…