HeliosZhao / Animate124

Animate124: Animating One Image to 4D Dynamic Scene
Apache License 2.0

error on running step 2 - Can't load tokenizer for 'damo-vilab/text-to-video-ms-1.7b' #2

Closed andytriboletti closed 5 months ago

andytriboletti commented 5 months ago

run: bash teststep2.sh

more teststep2.sh
seed=0
gpu=0
exp_root_dir=outputs
DATA_DIR="panda-dance"
STATIC_PROMPT="a high resolution DSLR image of panda"
DYNAMIC_PROMPT="a panda is dancing"
CN_PROMPT="a is dancing"
lambda_sd_img=0.01

--------- Stage 2 (Dynamic Coarse Stage) ---------

ckpt=outputs/animate124-stage1/${STATIC_PROMPT}@LAST/ckpts/last.ckpt
python launch.py --config custom/threestudio-animate124/configs/animate124-stage2-ms.yaml --train --gpu $gpu \
    data.image.image_path=custom/threestudio-animate124/load/${DATA_DIR}/_rgba.png \
    system.prompt_processor.prompt="${DYNAMIC_PROMPT}" \
    system.weights="$ckpt"

error:

Seed set to 0
[INFO] Loading Stable Diffusion ...
model_index.json: 100%|████████████████████| 384/384 [00:00<00:00, 4.08MB/s]
text_encoder/config.json: 100%|████████████████████| 644/644 [00:00<00:00, 2.26MB/s]
scheduler/scheduler_config.json: 100%|████████████████████| 465/465 [00:00<00:00, 1.74MB/s]
unet/config.json: 100%|████████████████████| 787/787 [00:00<00:00, 8.42MB/s]
vae/config.json: 100%|████████████████████| 657/657 [00:00<00:00, 8.67MB/s]
diffusion_pytorch_model.safetensors: 100%|████████████████████| 335M/335M [00:50<00:00, 6.60MB/s]
model.safetensors: 100%|████████████████████| 1.36G/1.36G [02:36<00:00, 8.72MB/s]
diffusion_pytorch_model.safetensors: 100%|████████████████████| 5.65G/5.65G [06:45<00:00, 13.9MB/s]
Fetching 8 files: 100%|████████████████████| 8/8 [06:46<00:00, 50.84s/it]
Loading pipeline components...: 100%|████████████████████| 4/4 [00:45<00:00, 11.34s/it]
Traceback (most recent call last):
  File "/home/andy/threestudio/launch.py", line 301, in <module>
    main(args, extras)
  File "/home/andy/threestudio/launch.py", line 169, in main
    system: BaseSystem = threestudio.find(cfg.system_type)(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andy/threestudio/custom/threestudio-animate124/systems/base.py", line 40, in __init__
    self.configure()
  File "/home/andy/threestudio/custom/threestudio-animate124/systems/animate124.py", line 63, in configure
    self.guidance_video = threestudio.find(self.cfg.guidance_type)(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andy/threestudio/threestudio/utils/base.py", line 83, in __init__
    self.configure(*args, **kwargs)
  File "/home/andy/threestudio/custom/threestudio-animate124/models/guidance/zeroscope_guidance.py", line 70, in configure
    self.tokenizer = CLIPTokenizer.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andy/miniconda3/envs/threestudio/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1795, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'damo-vilab/text-to-video-ms-1.7b'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'damo-vilab/text-to-video-ms-1.7b' is the correct path to a directory containing all relevant files for a CLIPTokenizer tokenizer.

HeliosZhao commented 5 months ago

Hi, can you run the following code and check whether it raises an error?

from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("damo-vilab/text-to-video-ms-1.7b", subfolder="tokenizer")
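
For anyone repeating this check, here is a minimal sketch that wraps the same call so the failure is reported as a message instead of a full traceback. The helper name `try_load_tokenizer` and its `loader` parameter are hypothetical and only there for illustration; the one real library call is `CLIPTokenizer.from_pretrained(..., subfolder="tokenizer")` from the snippet above.

```python
from typing import Callable, Optional, Tuple


def try_load_tokenizer(
    repo_id: str = "damo-vilab/text-to-video-ms-1.7b",
    subfolder: str = "tokenizer",
    loader: Optional[Callable[..., object]] = None,
) -> Tuple[Optional[object], Optional[str]]:
    """Return (tokenizer, None) on success or (None, error message) on failure.

    `loader` is a hypothetical hook for swapping in a stand-in loader;
    by default the real CLIPTokenizer.from_pretrained is used.
    """
    if loader is None:
        # Imported lazily so the helper itself does not require transformers.
        from transformers import CLIPTokenizer

        loader = CLIPTokenizer.from_pretrained
    try:
        return loader(repo_id, subfolder=subfolder), None
    except OSError as exc:
        # The traceback in this issue ends in OSError (raised as EnvironmentError).
        return None, f"{type(exc).__name__}: {exc}"
```

Calling `try_load_tokenizer()` with no arguments attempts the real download; if the second element of the result is not `None`, it holds the same message shown in the traceback.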
andytriboletti commented 5 months ago

I logged in to Hugging Face, set it up on the command line, and got past this error.
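
The thread does not say which command was used or why logging in helped; `huggingface-cli login` and `huggingface_hub.login()` are the usual ways to store a token. As a rough sketch, the check below looks for a token in the places recent `huggingface_hub` versions are understood to read it from: the `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN` environment variables, then a `token` file under `HF_HOME` (assumed default `~/.cache/huggingface`). The function name `find_hf_token` is hypothetical, and the file location is an assumption that may differ across versions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_hf_token(
    env: Optional[Mapping[str, str]] = None,
    token_file: Optional[Path] = None,
) -> Optional[str]:
    """Return a Hugging Face token if one appears to be configured, else None."""
    env = os.environ if env is None else env
    # Environment variables are checked first.
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    if token_file is None:
        # Assumed default location of the token written by the login command.
        hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
        token_file = Path(hf_home) / "token"
    if token_file.is_file():
        return token_file.read_text().strip() or None
    return None
```

If `find_hf_token()` returns `None`, no token was found in those locations, and logging in from the command line as described above is worth trying before rerunning step 2.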