Picsart-AI-Research / StreamingT2V

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
https://streamingt2v.github.io/

Must initialize from damo-vilab/text-to-video-ms-1.7b? #17

Open Sutongtong233 opened 2 months ago

Sutongtong233 commented 2 months ago

Hi, I found that here: https://github.com/Picsart-AI-Research/StreamingT2V/blame/c1b8068bcbcdbbfa0dd0df3371d3c93a1f5132de/t2v_enhanced/model_init.py#L71C7-L71C7

def init_streamingt2v_model(ckpt_file, result_fol):
    ...
    cli = CustomCLI(VideoLDM)

It seems that VideoLDM must be initialized from damo-vilab/text-to-video-ms-1.7b (the config sets pipeline_repo: damo-vilab/text-to-video-ms-1.7b). Moreover, ckpt_file, which is the path to streaming_t2v.ckpt, is only appended to sys.argv and is never passed to any function directly. I am confused about this piece of code and hope to get your explanation, thanks :)
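For anyone else tracing this: appending the path to sys.argv is how a Lightning-style CLI object gets to see it when it parses arguments later. A minimal stdlib sketch of that pattern (init_streaming_model, the flag names, and the argparse parser are illustrative stand-ins, not the repo's actual code):

```python
import argparse
import sys


def init_streaming_model(ckpt_file: str, result_fol: str) -> argparse.Namespace:
    # Mimic the repo's trick: push CLI-style flags into sys.argv so a
    # CustomCLI-like parser picks them up when it is constructed later.
    sys.argv += ["--ckpt", ckpt_file, "--result-dir", result_fol]

    parser = argparse.ArgumentParser()
    parser.add_argument("--ckpt")
    parser.add_argument("--result-dir")
    # parse_known_args reads sys.argv, where the path now lives,
    # and ignores any unrelated arguments.
    args, _ = parser.parse_known_args()
    return args


args = init_streaming_model("streaming_t2v.ckpt", "results/")
print(args.ckpt)  # → streaming_t2v.ckpt
```

So the checkpoint path does reach the model-building machinery, just indirectly through the process arguments rather than through a function parameter.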

hpoghos commented 2 months ago

Hi @Sutongtong233, the StreamingT2V model is loaded at line 105; you can check that it does indeed use streaming_t2v.ckpt.

zeroCAY commented 2 months ago

I also have this question. I already have streaming_t2v.ckpt, but the code still asks to download laion/CLIP-ViT-H-14-laion2B-s32B-b79K from Hugging Face. Since my server can't connect to Hugging Face for direct downloads, is this download necessary? Is it possible to run the whole process with just streaming_t2v.ckpt?
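A related workaround for offline servers, assuming every required file is already present in the local Hugging Face cache (offline mode does not remove the need for the CLIP weights, it only stops the network lookup): force the hub libraries into offline mode before they are imported.

```python
import os

# Must be set BEFORE importing huggingface_hub / transformers / open_clip;
# once set, the libraries resolve models from the local cache only
# instead of contacting huggingface.co.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```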

Mike001-wq commented 2 months ago

> Is it possible to run the whole process with just streaming_t2v.ckpt?

I solved this problem by changing imageembedder.py line 75:

    model, _, _ = open_clip.create_model_and_transforms(
        arch,
        device=torch.device("cpu"),
        # pretrained=version,
        pretrained="./damo-vilab/open_clip_pytorch_model.bin",
    )

Point the pretrained parameter at your local copy of the file, downloaded from Hugging Face.
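For getting open_clip_pytorch_model.bin onto an offline server in the first place, here is a sketch using huggingface_hub (run it on a machine that can reach Hugging Face, then copy the file across; fetch_clip_weights and the local_dir default are illustrative names, not part of the repo):

```python
def fetch_clip_weights(local_dir: str = "./damo-vilab") -> str:
    """Download the open_clip checkpoint that the patched call points at."""
    # Lazy import so the function can be defined even on the offline box.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
        filename="open_clip_pytorch_model.bin",
        local_dir=local_dir,
    )


# On the connected machine:
# path = fetch_clip_weights()
# then scp/rsync the file to ./damo-vilab/ on the offline server.
```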

zeroCAY commented 2 months ago

Thanks, I solved the problem using your method~

kunkun-zhu commented 2 months ago

Hi @Sutongtong233, have you solved the problem? I also have this issue: "OSError: Cannot load model damo-vilab/text-to-video-ms-1.7b: model is not cached locally", but I can't connect to huggingface.

ffhelly commented 1 month ago

> Hi, have you solved the problem? I also have this issue: "OSError: Cannot load model damo-vilab/text-to-video-ms-1.7b: model is not cached locally", but I can't connect to huggingface.

Pull a copy from a mirror site to your local machine, or use a proxy to download it.
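For readers who want to script the mirror route: huggingface_hub honors the HF_ENDPOINT environment variable, so you can point it at a mirror before importing it. A sketch (download_from_mirror is an illustrative name, and hf-mirror.com is just one example mirror, substitute whichever you trust):

```python
import os


def download_from_mirror(repo_id: str, local_dir: str) -> str:
    # HF_ENDPOINT must be set before huggingface_hub is imported,
    # hence the lazy import below.
    os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
    from huggingface_hub import snapshot_download

    # Fetches the full repo (config, weights, etc.) into local_dir.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage, on a machine that can reach the mirror:
# download_from_mirror("damo-vilab/text-to-video-ms-1.7b",
#                      "./damo-vilab/text-to-video-ms-1.7b")
```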