huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
https://huggingface.co/docs/accelerate
Apache License 2.0

Distributed inference example for llava_next #3179

Open · VladOS95-cyber opened 1 month ago

VladOS95-cyber commented 1 month ago

Add distributed inference example for LLaVA-NeXT-Video-7B-hf
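The core pattern behind such an example is accelerate's `PartialState.split_between_processes` context manager, which shards a list of prompts or video paths across processes so each rank runs inference on its own slice. The sketch below illustrates the sharding idea in plain Python (an even split where earlier ranks absorb the remainder); it is an illustration of the behavior, not accelerate's actual implementation, and `split_between_processes` here is a local stand-in function.

```python
def split_between_processes(inputs, num_processes, process_index):
    """Illustrative even split of `inputs` across `num_processes`.

    Earlier ranks absorb the remainder, mirroring how a list of
    video prompts can be sharded for distributed inference.
    """
    base, extra = divmod(len(inputs), num_processes)
    start = process_index * base + min(process_index, extra)
    end = start + base + (1 if process_index < extra else 0)
    return inputs[start:end]

# Simulate 7 videos sharded across 3 ranks.
prompts = [f"video_{i}.mp4" for i in range(7)]
shards = [split_between_processes(prompts, 3, rank) for rank in range(3)]
print(shards)  # ranks 0..2 receive 3, 2, and 2 items respectively
```

In a real script, each rank would pass its shard through the LLaVA-NeXT-Video processor and model inside the `with state.split_between_processes(prompts) as shard:` block instead of computing the slice by hand.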


Who can review?

@sayakpaul @a-r-r-o-w @muellerzr

VladOS95-cyber commented 1 month ago

Hi @sayakpaul @a-r-r-o-w! This PR is ready for review; please take a look.

HuggingFaceDocBuilderDev commented 1 month ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

VladOS95-cyber commented 2 weeks ago

Hey @sayakpaul! Please take a look at the recent changes. Do they look OK to you?

sayakpaul commented 1 day ago

@muellerzr I think this is now ready for your review.

sayakpaul commented 1 day ago

Thanks for the updates; I left some minor comments. I think this is close to merge.

Could you additionally show some sample outputs (i.e., video and generated-caption pairs)?

VladOS95-cyber commented 1 day ago

@sayakpaul Thank you for your review! Do you mean providing some examples in the script's description? Just in case, here is an example:

```json
{
  "prompt": "USER: <video>\nGenerate caption ASSISTANT:",
  "video": "/Users/{user_name}/.cache/huggingface/hub/datasets--malterei--LLaVA-Video-small-swift/snapshots/7d712657d4907c4f29c2c6fa17afaa7289a362a7/videos/4255049031.mp4",
  "generated_text": [
    "USER: \nGenerate caption ASSISTANT: \"Practicing his moves: A man in a white shirt and jeans demonstrates his agility and balance on a basketball court, with a water bottle nearby, ready for hydration.\""
  ]
}
```
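In a distributed run, each process generates captions for its shard, and the per-rank results are collected on the main process before being written out as records like the sample above. The sketch below shows that collection step in plain Python; `generate_caption` is a hypothetical stand-in for the model call, and the gather is simulated with a list flatten (in a real script, `accelerate.utils.gather_object` would collect the per-rank lists).

```python
import json

def generate_caption(video_path):
    # Hypothetical stand-in for the LLaVA-NeXT-Video generate() call.
    return f"Caption for {video_path}"

prompt = "USER: <video>\nGenerate caption ASSISTANT:"

# Per-rank shards of video paths; in a real run, accelerate.utils.gather_object
# would collect the per-rank result lists onto the main process.
shards = [["videos/a.mp4", "videos/b.mp4"], ["videos/c.mp4"]]
gathered = [
    {"prompt": prompt, "video": video, "generated_text": [generate_caption(video)]}
    for shard in shards
    for video in shard
]
print(json.dumps(gathered, indent=2))
```

The record layout (`prompt`, `video`, `generated_text`) follows the sample output shown in this thread.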