Open VladOS95-cyber opened 1 month ago
Hi @sayakpaul @a-r-r-o-w! This PR is ready for review; please take a look.
Hey @sayakpaul! Please take a look at the recent changes. Do they look OK to you?
@muellerzr I think this is now ready for your review.
Thanks for the updates. I left some minor comments. I think this is close to merging.
Could you additionally show some sample outputs (i.e., video - generated caption pairs)?
@sayakpaul Thank you for your review! Do you mean providing some examples in the script's description? Just in case, here is an example:
{
  "prompt": "USER: <video>\nGenerate caption ASSISTANT:",
  "video": "/Users/{user_name}/.cache/huggingface/hub/datasets--malterei--LLaVA-Video-small-swift/snapshots/7d712657d4907c4f29c2c6fa17afaa7289a362a7/videos/4255049031.mp4",
  "generated_text": [
    "USER: \nGenerate caption ASSISTANT: \"Practicing his moves: A man in a white shirt and jeans demonstrates his agility and balance on a basketball court, with a water bottle nearby, ready for hydration.\""
  ]
}
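Since `generated_text` in the record above echoes the prompt before the caption, a reader of the output may want only the part after `ASSISTANT:`. Below is a minimal sketch of one way to do that. It is not part of the PR's script: the helper `extract_caption` is a hypothetical name, and the `video` value is shortened to the file name for readability.

```python
# Hypothetical post-processing helper; not taken from the PR's script.
def extract_caption(generated_text: str, marker: str = "ASSISTANT:") -> str:
    """Return the text after the last marker, without surrounding quotes."""
    # rpartition returns the whole string as the tail when the marker is absent.
    _, _, tail = generated_text.rpartition(marker)
    return tail.strip().strip('"')


# Record shaped like the sample output above (video path shortened; assumption).
record = {
    "prompt": "USER: <video>\nGenerate caption ASSISTANT:",
    "video": "4255049031.mp4",
    "generated_text": [
        "USER: \nGenerate caption ASSISTANT: \"Practicing his moves: A man in a "
        "white shirt and jeans demonstrates his agility and balance on a "
        "basketball court, with a water bottle nearby, ready for hydration.\""
    ],
}

captions = [extract_caption(text) for text in record["generated_text"]]
print(captions[0])
```

Splitting on the last occurrence of the marker keeps the helper usable if the prompt itself happens to contain `ASSISTANT:` earlier in the string.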
Add distributed inference example for LLaVA-NeXT-Video-7B-hf
Who can review?
@sayakpaul @a-r-r-o-w @muellerzr