antoyang / VidChapters

[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
http://arxiv.org/abs/2309.13952
MIT License

tokenizer in demo_vid2seq.py #9

Closed. tickm closed this issue 11 months ago

tickm commented 1 year ago

Hello, I want to use demo_vid2seq.py to get video captions, but there are several things I don't understand. First, when I run demo_vid2seq.py, I get this error:

load Vid2Seq model
Traceback (most recent call last):
  File "demo_vid2seq.py", line 55, in <module>
    tokenizer = _get_tokenizer(args.model_name, args.num_bins)
  File "/root/tzp/codes/VidChapters-main/model/vid2seq.py", line 12, in _get_tokenizer
    tokenizer = T5Tokenizer.from_pretrained(tokenizer_path, local_files_only=True)
  File "/root/.local/conda/envs/vidchapter/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1796, in from_pretrained
    f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
OSError: Can't load tokenizer for 't5-base'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 't5-base' is the correct path to a directory containing all relevant files for a T5Tokenizer tokenizer.

I don't know how to set up and load the tokenizer. Can you help me?

antoyang commented 1 year ago

This error happens when the tokenizer is not loaded properly, so make sure you have downloaded it and that the TRANSFORMERS_CACHE environment variable is set accordingly.
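
A minimal sketch of what that one-time setup could look like (the cache path below is an assumption, not something from the repo):

import os

# Point the transformers cache at a directory you control (assumed path),
# before importing transformers so the variable is picked up.
os.environ["TRANSFORMERS_CACHE"] = "/path/to/transformers_cache"

from transformers import T5Tokenizer

# First run *without* local_files_only=True so the t5-base files are downloaded
# and cached; demo_vid2seq.py can then load them offline.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
print(tokenizer("hello world").input_ids)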

fazliimam commented 1 year ago

I'm getting a similar error when I try to run demo_vid2seq.py: OSError: t5-base does not appear to have a file named pytorch_model.bin but there is a file for TensorFlow weights. Use from_tf=True to load this model from those weights. @tickm did you manage to solve your error?
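
For what it's worth, a hedged sketch of the workaround the error message itself suggests (this assumes TensorFlow is installed; the alternative is simply to re-download the PyTorch weights for t5-base):

from transformers import T5ForConditionalGeneration

# Convert the cached TensorFlow checkpoint to PyTorch on load (requires TensorFlow installed).
model = T5ForConditionalGeneration.from_pretrained("t5-base", from_tf=True)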

HoldenHuan9 commented 1 year ago

@tickm do you know how to set up the tokenizer yet? I have the same problem.

YoucanBaby commented 1 year ago

@tickm I successfully ran the code!

First, you need to comment out this line.

Then, create a new file named download_t5.py and write the following content:

from transformers import pipeline

# Building any t5-base pipeline downloads and caches the tokenizer and weights.
pipe = pipeline("translation", model="t5-base")

Run python download_t5.py, and transformers will download t5-base into ~/.cache/huggingface/ (under a models--t5-base directory).

Finally, run demo_vid2seq.py and transformers will use the cached t5-base.
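
A quick way to check that the cache is in place before running the demo is to repeat the same offline load that model/vid2seq.py performs (a small sketch, assuming the default t5-base model name):

from transformers import T5Tokenizer

# Mirrors the load in _get_tokenizer; this only succeeds if t5-base is cached locally.
tokenizer = T5Tokenizer.from_pretrained("t5-base", local_files_only=True)
print(len(tokenizer))  # vocabulary size of the cached tokenizer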