Closed: FurkanGozukara closed this issue 10 months ago.
The error looks like it is in the file-downloading part: File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\download.py", line 50, in auto_create
After deleting C:\Users\King\.triton\ I still get this error:
[2024-01-12 00:03:13,644] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-12 00:03:13,743] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
Please build and install Nvidia apex package with option '--cuda_ext' according to https://github.com/NVIDIA/apex#from-source .
bin G:\CogVLM\CogVLM\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda121.dll
Exception ignored in: <function BaseFileLock.__del__ at 0x000001A1416DF2E0>
Traceback (most recent call last):
File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\_api.py", line 240, in __del__
self.release(force=True)
File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\_api.py", line 201, in release
with self._thread_lock:
AttributeError: 'WindowsFileLock' object has no attribute '_thread_lock'
Traceback (most recent call last):
File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 234, in <module>
main(args)
File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 165, in main
model, image_processor, cross_image_processor, text_processor_infer = load_model(args)
File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 65, in load_model
model, model_args = AutoModel.from_pretrained(
File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\base_model.py", line 337, in from_pretrained
return cls.from_pretrained_base(name, args=args, home_path=home_path, url=url, prefix=prefix, build_only=build_only, overwrite_args=overwrite_args, **kwargs)
File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\base_model.py", line 312, in from_pretrained_base
model_path = auto_create(name, path=home_path, url=url)
File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\download.py", line 50, in auto_create
lock = FileLock(model_path + '.lock', mode=0o777)
TypeError: BaseFileLock.__init__() got an unexpected keyword argument 'mode'
Press any key to continue . . .
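For context, the failing call is lock = FileLock(model_path + '.lock', mode=0o777) in sat/resources/download.py, and the TypeError suggests the installed filelock release does not accept a mode keyword. A minimal workaround sketch, assuming that version mismatch is the cause (model_path below is a placeholder):

    from filelock import FileLock

    model_path = "checkpoints/cogagent-vqa"  # placeholder; sat derives the real path from the model name
    try:
        # The call as written in sat/resources/download.py (newer filelock accepts 'mode')
        lock = FileLock(model_path + '.lock', mode=0o777)
    except TypeError:
        # Older filelock releases only take the lock-file path
        lock = FileLock(model_path + '.lock')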
OK, I made it work with a custom app.
Loading as 4-bit right now.
Are there any disadvantages, and how much quality do we lose with 4-bit or 8-bit compared to 16-bit?
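(For reference, and not the custom app mentioned above: a minimal sketch of what 4-bit loading can look like for the Hugging Face weights, assuming the THUDM/cogagent-vqa-hf checkpoint and the bitsandbytes backend.)

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # 4-bit quantized load via bitsandbytes (sketch; checkpoint name assumed)
    bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(
        "THUDM/cogagent-vqa-hf",
        quantization_config=bnb_config,
        trust_remote_code=True,
        low_cpu_mem_usage=True,
    ).eval()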
Another question I have:
I am working on a tutorial on using cogagent-vqa for image captioning for Stable Diffusion training.
Could you let me know the optimal parameters for captioning images? It will be one question and one answer.
Also, is cogagent-vqa best for this? do_sample: false or true? top_p: what value? top_k: what value? temperature: what value?
Can you tell me the default values to set them to?
For your second question, I can answer this: cogagent-vqa is not suitable for the requirement of image captioning for Stable Diffusion training; it is designed for image understanding and question-and-answer dialogue.
As for the default parameters, they are set according to the default values in our online demo: do_sample = False, temperature = 0.9, top_k = 2.
If you want to set do_sample = True, then use top_p = 0.8.
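Expressed as generate() keyword arguments, those defaults amount to roughly the following sketch (max_length is taken from the demo code later in this thread):

    # Defaults described above, as generate() kwargs (sketch)
    greedy_kwargs = {"do_sample": False, "temperature": 0.9, "top_k": 2, "max_length": 2048}
    # If sampling is enabled, top_p is used instead
    sampling_kwargs = {"do_sample": True, "top_p": 0.8, "temperature": 0.9, "max_length": 2048}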
By the way, any models with -chat (for example cogvlm-chat) are image understanding chat models. If you are using a chat model, you will not have any text-to-image functionality.
Yes, I know. I am looking for the best image-to-text model for understanding images.
Is cogagent-vqa the best model for understanding images? Thank you.
If you are looking for an open-source, high-resolution image understanding model, then CogAgent is currently a very good choice.
@zRzRzRzRzRzRzR if you can verify this I would appreciate it a lot.
Currently I am using the THUDM/cogagent-vqa-hf model.
This is the best image understanding model, am I right?
I used the example code you provided on Hugging Face:
# Model and tokenizer loading added for completeness; it follows the Hugging Face model card example
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, LlamaTokenizer

MODEL_PATH = "THUDM/cogagent-vqa-hf"
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
torch_type = torch.bfloat16

tokenizer = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch_type,
    low_cpu_mem_usage=True,
    trust_remote_code=True
).to(DEVICE).eval()

def post(input_text, temperature, top_p, top_k, image_prompt, do_sample):
    try:
        with torch.no_grad():
            # Load the image (if any) and build the conversation inputs
            image = Image.open(image_prompt).convert('RGB') if image_prompt is not None else None
            input_by_model = model.build_conversation_input_ids(
                tokenizer, query=input_text, history=[],
                images=([image] if image else None), template_version='base')
            inputs = {
                'input_ids': input_by_model['input_ids'].unsqueeze(0).to(DEVICE),
                'token_type_ids': input_by_model['token_type_ids'].unsqueeze(0).to(DEVICE),
                'attention_mask': input_by_model['attention_mask'].unsqueeze(0).to(DEVICE),
                'images': [[input_by_model['images'][0].to(DEVICE).to(torch_type)]],
            }
            if 'cross_images' in input_by_model and input_by_model['cross_images']:
                inputs['cross_images'] = [[input_by_model['cross_images'][0].to(DEVICE).to(torch_type)]]
            # Generation parameters passed straight to model.generate()
            gen_kwargs = {
                "max_length": 2048,
                "temperature": temperature,
                "do_sample": do_sample,
                "top_p": top_p,
                "top_k": top_k
            }
            outputs = model.generate(**inputs, **gen_kwargs)
            # Drop the prompt tokens and decode only the generated answer
            outputs = outputs[:, inputs['input_ids'].shape[1]:]
            response = tokenizer.decode(outputs[0])
            response = response.split("</s>")[0]
            return response
    except Exception as e:
        return str(e)
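A single zero-shot call could then look like this (the prompt and image path are placeholders, and the parameter values follow the defaults mentioned above):

    caption = post(
        "Describe this image in detail.",  # placeholder captioning prompt
        temperature=0.9,
        top_p=0.8,
        top_k=2,
        image_prompt="example.jpg",        # placeholder image path
        do_sample=False,
    )
    print(caption)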
This is best if you are just doing VQA, not continuous dialogue. If you want to use dialogue, you should use this model: https://huggingface.co/THUDM/cogagent-chat-hf
I am doing zero-shot, no continuous chat. Thank you.
Hello. I have installed every one of the libraries, including Triton and DeepSpeed.
But when I start the demo Gradio app I get the error below. How do I fix it?
How I start it:
python basic_demo\web_demo.py --from_pretrained cogagent-vqa --version chat_old --bf16
The error:
pip freeze output is below:
Who can help?
@zRzRzRzRzRzRzR @1049451037 @wenyihong @mactavish91 @lykeven @duchenzhuang
Information