Open zRzRzRzRzRzRzR opened 1 week ago
Same question here; this is currently hard to reproduce.
Hi ZR,
Thanks for your interest!
> The repository does not seem to provide example prompt requirements, such as the language and how to structure the length properly. By reading the code, I gathered the following:
>
> - No negative_prompt
> - The input prompt should be < 77 tokens (CLIP)
> - The input must be in English.
Regarding the video length, we apologize that the current open-source version only supports videos shorter than 10 seconds. The timeline for open-sourcing additional versions, including the I2V model, is still undecided.
@zRzRzRzRzRzRzR could you kindly share your full steps, from start to end, for how you created this video? How did you download the checkpoints, which library versions did you use, etc.? I am trying to follow their README but it is incomplete. I am getting the following error:

```
from models.VchitectXL import VchitectXLTransformerModel
  File "/home/Ubuntu/Vchitect-2.0/models/VchitectXL.py", line 34, in <module>
    from torch.distributed.tensor.parallel import (
ImportError: cannot import name 'PrepareModuleOutput' from 'torch.distributed.tensor.parallel' (/home/Ubuntu/miniconda3/envs/VchitectXL/lib/python3.11/site-packages/torch/distributed/tensor/parallel/__init__.py)
```
Hi, just to clarify: were the examples on the webpage generated using a different model than the open-source one? Thanks
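The ImportError above usually means the installed PyTorch predates the version that added `PrepareModuleOutput` to `torch.distributed.tensor.parallel`. A minimal compatibility check, making no assumption about the installed version (the helper name is mine):

```python
import importlib


def has_prepare_module_output() -> bool:
    """Return True if the running PyTorch exports PrepareModuleOutput."""
    try:
        mod = importlib.import_module("torch.distributed.tensor.parallel")
    except ImportError:
        return False  # torch missing, or too old to have this subpackage
    return hasattr(mod, "PrepareModuleOutput")


print(has_prepare_module_output())
```

If this prints `False`, upgrading PyTorch (to whatever version the README pins, if any) is the likely fix.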
I will upload later.
Step 1: `test.txt` has only one line:

```
A little girl is riding a bicycle at high speed. Focused, detailed, realistic.
```
Step 2: modify the code in `inference.py`:

```python
def infer(args):
    pipe = VchitectXLPipeline(args.ckpt_path)
    idx = 0
```

Change to:

```python
def infer(args):
    pipe = VchitectXLPipeline(args.ckpt_path, device="cuda")
    idx = 0
```
Step 3: if you want to change the number of frames (i.e. the generated video length):

```python
with torch.cuda.amp.autocast(dtype=torch.bfloat16):
    video = pipe(
        prompt,
        negative_prompt="",
        num_inference_steps=50,
        guidance_scale=7.5,
        width=768,
        height=432,   # also works: 480x288, 624x352, 432x240, 768x432
        frames=10*8,  # change here: seconds * fps (the model outputs 8 fps)
    )
```
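The `frames` arithmetic above can be sketched as a tiny helper (the fixed 8 fps rate is taken from the comment in the step above; the function name is mine):

```python
FPS = 8  # the open-source model generates a fixed 8 frames per second


def total_frames(seconds: int, fps: int = FPS) -> int:
    """Frame count to request for a clip of the given length in seconds."""
    return seconds * fps


print(total_frames(10))  # a 10-second clip -> 80 frames
```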
Step 4: run the program:

```shell
CUDA_VISIBLE_DEVICES=8 python inference.py --test_file test.txt --save_dir output --ckpt_path Vchitect-XL-2B
```

(`--ckpt_path` should be the absolute path of the model you downloaded.)
This runs for me; I hope it helps.
Hi ZR,
Thanks for your interest!
> The repository does not seem to provide example prompt requirements, such as the language and how to structure the length properly. By reading the code, I gathered the following:
>
> - No negative_prompt
> - The input prompt should be < 77 tokens (CLIP)
> - The input must be in English.
* We will add more info to the README ASAP.
* negative prompt is supported.
* The input prompt can be longer than 77 tokens (T5 accepts longer prompts; CLIP cannot, but that is fine).
* Yes, the input must be English.
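To gauge whether a prompt fits CLIP's 77-token window, here is a rough checker: it uses the `transformers` CLIPTokenizer when available and falls back to a loose whitespace estimate otherwise (the function name and the fallback are my own, not part of the repository):

```python
def rough_token_count(prompt: str) -> int:
    """Estimate the token count of a prompt; CLIP truncates past 77 tokens."""
    try:
        from transformers import CLIPTokenizer
        tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
        return len(tok(prompt)["input_ids"])
    except Exception:
        # transformers unavailable (or no network to fetch the tokenizer):
        # a whitespace split gives a loose lower bound on the token count.
        return len(prompt.split())


print(rough_token_count("A little girl is riding a bicycle at high speed."))
```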
How can I change the frame rate? The video currently generates at 8 frames per second.
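There is no pipeline parameter for frame rate in the open-source code; one workaround (my suggestion, not the authors') is to re-time the saved 8 fps clip with ffmpeg. The helper below only builds the command line, so the file names and the fixed 8 fps source rate are illustrative assumptions:

```python
def retime_cmd(src: str, dst: str, target_fps: int, source_fps: int = 8) -> list[str]:
    """Build an ffmpeg command that re-times a clip to target_fps.

    Note: this changes playback speed (no new frames are synthesized);
    a factor < 1 speeds the clip up, a factor > 1 slows it down.
    """
    factor = source_fps / target_fps
    return [
        "ffmpeg", "-i", src,
        "-vf", f"setpts={factor:.4f}*PTS",
        "-r", str(target_fps),
        dst,
    ]


print(retime_cmd("sample.mp4", "sample_24fps.mp4", 24))
```

For a true frame-rate change at the same duration you would need frame interpolation (e.g. ffmpeg's `minterpolate` filter) rather than a speed change.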
Dear Development Team,
Hello, I have successfully installed the model and run it according to the requirements in the README, but I have encountered some issues and look forward to your response.
Regarding `negative_prompt`: I am unsure how to structure the prompt in this case, so I simply wrote a prompt:
and set the seed to 42:
I set the output to 720x480 according to the README, and configured it as follows:
It occupied 67904MiB of GPU memory. The other parameters remained unchanged, with 50 sampling steps. The final video can be found here:
https://github.com/user-attachments/assets/61e84c11-490e-42ab-9703-3d78f7c1119a
Is this the expected result?
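For scale, the memory footprint reported above converts as follows (plain unit arithmetic, not a new measurement):

```python
mib_used = 67904            # GPU memory reported above, in MiB
gib_used = mib_used / 1024  # 1 GiB = 1024 MiB
print(f"{gib_used:.1f} GiB")  # -> 66.3 GiB
```

That is roughly 66 GiB, so 720x480 at these settings needs a high-memory GPU.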
I did not see any relevant details about I2V in the code, nor any place where an image can be used as input. Should I understand that this open-source model is a T2V model?
It seems that there is no parameter to control the frame rate.
However, the video I generated plays at only 8 frames per second, with 40 frames in total, as verified using the following command:
Is it because the open-source model only outputs 8 frames per second?
Additionally, there may be some issues in the code within the repository:
* https://github.com/Vchitect/Vchitect-2.0/blob/0ef47a5d3368a1a7b235d7c3511be24f5febf791/models/pipeline.py#L198 should be modified to `device = "cuda"`, or `device = "cuda"` should be added in https://github.com/Vchitect/Vchitect-2.0/blob/0ef47a5d3368a1a7b235d7c3511be24f5febf791/inference.py#L15. Otherwise, a "tensors not on the same device" error occurs during pos embed.
* Should https://github.com/Vchitect/Vchitect-2.0/tree/master/models/__pycache__ be deleted? It seems unnecessary.
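The device fix described above can be sketched as choosing the device once and reusing it everywhere (a sketch, not the repository's actual code; the try/except fallback is mine so the snippet runs even without a GPU or without torch installed):

```python
def pick_device() -> str:
    """Pick one device string to pass to the pipeline and pos-embed code,
    so all tensors end up on the same device."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"  # torch not installed; illustrative fallback


print(pick_device())
```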
Looking forward to your response.