TonyLianLong / LLM-groundedVideoDiffusion

[ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper
https://llm-grounded-video-diffusion.github.io/

When will you release the code? #1

Open jianlong-yuan opened 8 months ago

TonyLianLong commented 8 months ago

We are working on the camera-ready paper and will release our code around the same time.

Bailey-24 commented 8 months ago

Following this thread for updates.

guyuchao commented 7 months ago

Hi, is there any update?

TonyLianLong commented 7 months ago

We are still organizing the code repo, which will include both the LLM text-to-DSL part and the DSL-to-video part based on cross-attention control. In the meantime, similar to LMD+, we offer a simple custom pipeline that conditions ModelScope with video GLIGEN adapters we trained ourselves.

Here is a colab that uses the model: https://colab.research.google.com/drive/17He4bFAF8lXmT9Nfv-Sg29iKtPelDUNZ
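Roughly, using such a GLIGEN-conditioned video pipeline in a diffusers-style workflow looks like the sketch below. Note that the model ID, custom pipeline path, and keyword arguments (`gligen_boxes`, `gligen_phrases`, `num_frames`) are placeholders for illustration, not the actual API; please follow the Colab above for the real usage.

```python
# Minimal sketch (not the official API) of invoking a custom diffusers
# pipeline that conditions ModelScope on per-frame bounding boxes.
import torch
from diffusers import DiffusionPipeline

# Placeholder checkpoint / pipeline names for illustration only.
pipe = DiffusionPipeline.from_pretrained(
    "path/to/lvd-modelscope-gligen",        # hypothetical model ID
    custom_pipeline="path/to/lvd_pipeline",  # hypothetical custom pipeline
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Per-frame layouts from the LLM (DSL): one list of boxes per frame,
# each box in normalized (x_min, y_min, x_max, y_max) coordinates.
frame_boxes = [[[0.05 + 0.04 * t, 0.55, 0.30 + 0.04 * t, 0.85]] for t in range(16)]
frame_phrases = [["a dog"] for _ in range(16)]

# Hypothetical call: the real pipeline's argument names may differ.
video_frames = pipe(
    prompt="grassland with a dog walking from the left to the right",
    gligen_boxes=frame_boxes,
    gligen_phrases=frame_phrases,
    num_frames=16,
).frames
```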

We plan to add the LLM part to the Colab soon as well.

The text-to-DSL part is straightforward: you can use the ChatGPT website for it. The prompt is in the paper appendix.

Example:

Prompt: An image of grassland with a dog walking from the left to the right.
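For illustration, the DSL the LLM returns is essentially a per-frame bounding-box layout for each object. The snippet below is a hypothetical approximation of what such a layout could look like for the prompt above; the exact format is defined by the prompt in the paper appendix.

```python
# Illustrative only: a hypothetical per-frame layout for the example prompt.
# Each entry: (frame_index, [(phrase, [x_min, y_min, x_max, y_max]), ...]),
# with coordinates normalized to [0, 1]; the dog's box moves left to right.
layouts = [
    (0,  [("a dog", [0.05, 0.55, 0.30, 0.85])]),
    (5,  [("a dog", [0.25, 0.55, 0.50, 0.85])]),
    (10, [("a dog", [0.45, 0.55, 0.70, 0.85])]),
    (15, [("a dog", [0.65, 0.55, 0.90, 0.85])]),
]

def interpolate_box(box_a, box_b, alpha):
    """Linearly interpolate between two boxes (alpha in [0, 1])
    to fill in the frames between the keyframes above."""
    return [a + alpha * (b - a) for a, b in zip(box_a, box_b)]
```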

blE-lj commented 7 months ago

Hi, this work is very impressive and I'm looking forward to seeing your code. Could you let us know when it will be released?

TonyLianLong commented 6 months ago

@jianlong-yuan @Bailey-24 @guyuchao @blE-lj Thanks for your interest! The code and the benchmark have been released. Feel free to ask any questions.