jhc13 / taggui

Tag manager and captioner for image datasets
GNU General Public License v3.0

THUDM/cogagent-vqa-hf not working #100

Closed ChristianMayer closed 4 months ago

ChristianMayer commented 5 months ago

After updating taggui to the latest state of the main branch and making sure the packages in requirements.txt are installed, I'm now trying to use THUDM/cogagent-vqa-hf, but I get this message:

Loading THUDM/cogagent-vqa-hf...
Traceback (most recent call last):
File "/home/cm/devel/StableDiffusion/taggui/taggui/auto_captioning/captioning_thread.py", line 308, in run
processor, model = self.load_processor_and_model(device, model_type)
File "/home/cm/devel/StableDiffusion/taggui/taggui/auto_captioning/captioning_thread.py", line 146, in load_processor_and_model
processor = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')
File "/home/cm/devel/StableDiffusion/taggui/venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2086, in from_pretrained
return cls._from_pretrained(
File "/home/cm/devel/StableDiffusion/taggui/venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2325, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/cm/devel/StableDiffusion/taggui/venv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 182, in __init__
self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
File "/home/cm/devel/StableDiffusion/taggui/venv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 214, in get_spm_processor
model_pb2 = import_protobuf(f"The new behaviour of {self.__class__.__name__} (with `self.legacy = False`)")
File "/home/cm/devel/StableDiffusion/taggui/venv/lib/python3.10/site-packages/transformers/convert_slow_tokenizer.py", line 40, in import_protobuf
from transformers.utils import sentencepiece_model_pb2_new as sentencepiece_model_pb2
File "/home/cm/devel/StableDiffusion/taggui/venv/lib/python3.10/site-packages/transformers/utils/sentencepiece_model_pb2_new.py", line 16, in <module>
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(
TypeError: Couldn't build proto file into descriptor pool: Invalid default '0.9995' for field sentencepiece.TrainerSpec.character_coverage of type 2

Note: THUDM/cogvlm-chat-hf also fails, and the error message looks similar (but I didn't compare it word for word).
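
In case it helps, the failure can be reproduced outside taggui with just the tokenizer call from the traceback (a minimal sketch; it assumes transformers and sentencepiece are installed in the same venv):

# Reproduces the call made in load_processor_and_model (captioning_thread.py).
# With the broken protobuf setup this raises the same TypeError as above.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')
print(type(tokenizer))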

jhc13 commented 5 months ago

I am away from home right now, so I will check when I get back. In the meantime, could you try completely deleting and recreating your virtual environment containing the dependencies? There was a similar issue that was fixed by doing this (#61).

jhc13 commented 4 months ago

I checked and it's working fine for me.

Try recreating the virtual environment as I said, and if it still does not work, you can try this fix, which seems to have solved the problem for many people.

ChristianMayer commented 4 months ago

No, neither recreating the venv nor reinstalling protobuf made a difference.

But I'm puzzled that it doesn't want to (re-)download THUDM/cogvlm-chat-hf, since I deleted it completely (I think/hope) after my first download was interrupted. At least the model doesn't appear in ~/.cache/huggingface/.
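
A quick way to see what is actually left in the cache (a sketch; the path is taken from above, and the cache layout may differ between huggingface_hub versions):

# List anything cogvlm-related that is still in the Hugging Face cache.
from pathlib import Path

cache = Path.home() / '.cache' / 'huggingface'
for entry in sorted(cache.rglob('*cogvlm*')):
    print(entry)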

jhc13 commented 4 months ago

If manual installation isn't working for you, you can try downloading the latest bundled release instead.

basvandenbroek commented 4 months ago

Downgrading protobuf to 3.20.3 solved this issue for me.
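
For anyone who wants to check which version is actually active inside the taggui venv, a small sketch (assuming the standard protobuf package is installed):

# Print the protobuf version seen by Python; the reports in this thread
# suggest 3.20.3 works, while the previously installed version triggered
# the descriptor pool TypeError above.
import google.protobuf

print(google.protobuf.__version__)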

ChristianMayer commented 4 months ago

@basvandenbroek thank you for the hint. After downgrading protobuf to 3.20.3, taggui does download THUDM/cogvlm-chat-hf and I get a prompt :+1: