Closed FurkanGozukara closed 10 months ago
The error appears to come from the file-download part: File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\", line 50, in auto_create
After deleting C:\Users\King\.triton\
I still get this error:
[2024-01-12 00:03:13,644] [INFO] [] Setting ds_accelerator to cuda (auto detect)
[2024-01-12 00:03:13,743] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
Please build and install Nvidia apex package with option '--cuda_ext' according to .
bin G:\CogVLM\CogVLM\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda121.dll
Exception ignored in: <function BaseFileLock.__del__ at 0x000001A1416DF2E0>
Traceback (most recent call last):
File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\", line 240, in __del__
File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\", line 201, in release
with self._thread_lock:
AttributeError: 'WindowsFileLock' object has no attribute '_thread_lock'
Traceback (most recent call last):
File "G:\CogVLM\CogVLM\basic_demo\", line 234, in <module>
File "G:\CogVLM\CogVLM\basic_demo\", line 165, in main
model, image_processor, cross_image_processor, text_processor_infer = load_model(args)
File "G:\CogVLM\CogVLM\basic_demo\", line 65, in load_model
model, model_args = AutoModel.from_pretrained(
File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\", line 337, in from_pretrained
return cls.from_pretrained_base(name, args=args, home_path=home_path, url=url, prefix=prefix, build_only=build_only, overwrite_args=overwrite_args, **kwargs)
File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\", line 312, in from_pretrained_base
model_path = auto_create(name, path=home_path, url=url)
File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\", line 50, in auto_create
lock = FileLock(model_path + '.lock', mode=0o777)
TypeError: BaseFileLock.__init__() got an unexpected keyword argument 'mode'
Press any key to continue . . .
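The final TypeError suggests that the installed filelock release does not accept the `mode` keyword that sat passes on that line, so upgrading filelock (for example with `pip install -U filelock`) is the likely fix. Where upgrading is not possible, a compatibility shim is one option. The following is a minimal sketch under that assumption, not sat's actual code: `make_lock`, `OldStyleLock`, and `NewStyleLock` are hypothetical names, and the two stand-in classes only mimic a lock class without and with a `mode` parameter.

```python
import inspect


def make_lock(lock_cls, path, mode=0o777):
    """Build lock_cls(path), passing `mode` only if the constructor accepts it."""
    params = inspect.signature(lock_cls.__init__).parameters
    accepts_mode = "mode" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_mode:
        return lock_cls(path, mode=mode)
    return lock_cls(path)


class OldStyleLock:
    """Hypothetical stand-in for a lock class that has no `mode` parameter."""

    def __init__(self, lock_file, timeout=-1):
        self.lock_file = lock_file


class NewStyleLock:
    """Hypothetical stand-in for a lock class that accepts `mode`."""

    def __init__(self, lock_file, timeout=-1, mode=0o644):
        self.lock_file = lock_file
        self.mode = mode


old_lock = make_lock(OldStyleLock, "model.lock")
new_lock = make_lock(NewStyleLock, "model.lock")
```

With the real library the call would be `make_lock(FileLock, model_path + '.lock')`. The earlier "Exception ignored in ... __del__" message is most likely a side effect of the same failure: the constructor raised before `_thread_lock` was set, so cleanup of the half-built object failed too.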
OK, I made it work with a custom app.
I am loading it as 4-bit right now.
Are there any disadvantages, and how much quality do we lose with 4-bit or 8-bit compared to 16-bit?
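As a rough intuition for the precision question, the toy sketch below rounds synthetic weights to 4-bit and 8-bit integer grids and compares the rounding error. It is an illustration only: the weights are random numbers, not CogVLM weights, the uniform absmax scheme is an assumption and differs from the formats bitsandbytes uses, and weight error does not translate directly into caption quality. `quantize_roundtrip` and `mean_abs_error` are hypothetical helper names.

```python
import random


def quantize_roundtrip(values, bits):
    """Uniform absmax quantization to signed `bits`-bit integers, then back to floats."""
    levels = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    scale = max(abs(v) for v in values) / levels
    return [round(v / scale) * scale for v in values]


def mean_abs_error(values, bits):
    """Average absolute difference between the original and the round-tripped values."""
    restored = quantize_roundtrip(values, bits)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


random.seed(0)
# Synthetic weights for illustration only; not taken from any real model.
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

err4 = mean_abs_error(weights, 4)
err8 = mean_abs_error(weights, 8)
print(f"4-bit mean error: {err4:.6f}")
print(f"8-bit mean error: {err8:.6f}")
```

Fewer bits give a coarser grid and a larger rounding error. How much that matters for captions is best checked by comparing outputs on the same images at each precision.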
Another question I have:
I am working on a tutorial for using cogagent-vqa to caption images for Stable Diffusion training.
Could you let me know the optimal parameters for captioning images? It will be a single question and answer.
Also, is cogagent-vqa the best model for this? do_sample: false or true? top_p: what value? top_k: what value? temperature: what value?
Can you tell me the default values to set them to?
For Question:
The answer is:
For your second question, I can answer this:
is not suitable for this requirement (image captioning for Stable Diffusion training). It is designed for image understanding and question-and-answer dialogue.
The default parameters follow the defaults of our online demo: do_sample = False, temperature = 0.9, top_k = 2.
If you set do_sample = True, then use top_p = 0.8.
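The defaults quoted above can be collected into keyword arguments for a Hugging Face style `generate` call. This is a minimal sketch, not the maintainers' code: `build_gen_kwargs` is a hypothetical helper, and `max_length=2048` is borrowed from the user's code in this thread. It assumes the usual `transformers` behavior, where temperature, top_k, and top_p only take effect when do_sample is True, so they are left out for greedy decoding.

```python
def build_gen_kwargs(do_sample=False, max_length=2048):
    """Return generation kwargs using the default values quoted in this thread."""
    kwargs = {"max_length": max_length, "do_sample": do_sample}
    if do_sample:
        # Sampling parameters are only meaningful when sampling is enabled.
        kwargs.update({"temperature": 0.9, "top_k": 2, "top_p": 0.8})
    return kwargs


greedy_kwargs = build_gen_kwargs()
sampling_kwargs = build_gen_kwargs(do_sample=True)
```

For captioning, greedy decoding (`do_sample=False`) gives the same caption for the same image on every run, which makes a dataset easier to reproduce.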
By the way, any model with -chat in its name (for example cogvlm-chat) is an image-understanding chat model. If you are using the latest chat model, you will not have any text-to-image functionality.
Yes, I know. I am looking for the best image-to-text model, one that understands images.
Is cogagent-vqa the best model for understanding images? Thank you.
If you are looking for an open source, high-resolution image understanding model, then Cogagent will be a very good choice currently.
@zRzRzRzRzRzRzR I would appreciate it a lot if you could verify this.
Currently I am using the THUDM/cogagent-vqa-hf model.
This is the best image understanding model, am I right?
I used the example code you provided on Hugging Face:
import torch
from PIL import Image
from transformers import LlamaTokenizer

# `model`, `DEVICE`, and `torch_type` are defined elsewhere (model loading is not shown in the post).
tokenizer = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')

def post(input_text, temperature, top_p, top_k, image_prompt, do_sample):
    try:
        with torch.no_grad():
            # This line was garbled in the post; Image.open(image_prompt) is an assumed
            # reconstruction that treats image_prompt as a file path.
            image = Image.open(image_prompt).convert('RGB') if image_prompt is not None else None
            input_by_model = model.build_conversation_input_ids(tokenizer, query=input_text, history=[], images=([image] if image else None), template_version='base')
            inputs = {
                'input_ids': input_by_model['input_ids'].unsqueeze(0).to(DEVICE),
                'token_type_ids': input_by_model['token_type_ids'].unsqueeze(0).to(DEVICE),
                'attention_mask': input_by_model['attention_mask'].unsqueeze(0).to(DEVICE),
                'images': [[input_by_model['images'][0].to(DEVICE).to(torch_type)]],
            }
            if 'cross_images' in input_by_model and input_by_model['cross_images']:
                inputs['cross_images'] = [[input_by_model['cross_images'][0].to(DEVICE).to(torch_type)]]
            gen_kwargs = {
                "max_length": 2048,
                "temperature": temperature,
                "do_sample": do_sample,
                "top_p": top_p,
                "top_k": top_k
            }
            outputs = model.generate(**inputs, **gen_kwargs)
            # Keep only the newly generated tokens, dropping the prompt tokens.
            outputs = outputs[:, inputs['input_ids'].shape[1]:]
            response = tokenizer.decode(outputs[0])
            # Cut the text at the end-of-sequence marker.
            response = response.split("</s>")[0]
            return response
    except Exception as e:
        return str(e)
This is the best choice if you are only doing VQA, not continuous dialogue. If you want dialogue, you should use this model
I am doing zero-shot queries with no continuous chat. Thank you.
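For the captioning workflow described in this thread (one question, one answer per image), a batch loop might look like the sketch below. It is a hedged illustration, not part of CogVLM: `caption_folder`, `IMAGE_SUFFIXES`, and the default question text are hypothetical, and writing a same-named .txt file next to each image is an assumed convention that should be checked against the training tool in use. `caption_fn` is any callable that takes a question and an image path and returns a string.

```python
from pathlib import Path

# Hypothetical set of file extensions treated as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def caption_folder(folder, caption_fn, question="Describe this image in detail."):
    """Write one <image-stem>.txt caption file next to each image in `folder`."""
    written = []
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files, including existing caption files
        caption = caption_fn(question, str(image_path)).strip()
        txt_path = image_path.with_suffix(".txt")
        txt_path.write_text(caption, encoding="utf-8")
        written.append(txt_path)
    return written
```

With the `post` function from this thread, a call could be `caption_folder("images", lambda q, p: post(q, 0.9, 0.8, 2, p, False))`. Note that `post` returns the exception text on failure, so it is worth spot-checking the generated files.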
Hello. I have installed all of the libraries, including Triton and DeepSpeed.
But when I start the Gradio demo app I get the error below. How can I fix it?
How I start it:
python basic_demo\ --from_pretrained cogagent-vqa --version chat_old --bf16
The error
pip freeze below
Who can help?
@zRzRzRzRzRzRzR @1049451037 @wenyihong @mactavish91 @lykeven @duchenzhuang
Information