THUDM / CogVLM

a state-of-the-art open visual language model | multimodal pretrained model
Apache License 2.0

Gradio APP not working all libraries are installed Windows 10 Python 3.10.11 - AttributeError: 'WindowsFileLock' object has no attribute '_thread_lock' #313

Closed FurkanGozukara closed 10 months ago

FurkanGozukara commented 10 months ago

Hello. I have installed all of the libraries, including Triton and DeepSpeed.

But when I start the demo Gradio app, I get the error below. How do I fix it?

How I start it:

python basic_demo\web_demo.py --from_pretrained cogagent-vqa --version chat_old --bf16

The error:

[2024-01-11 23:58:10,736] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-11 23:58:10,831] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
Please build and install Nvidia apex package with option '--cuda_ext' according to https://github.com/NVIDIA/apex#from-source .
bin G:\CogVLM\CogVLM\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda121.dll
Exception ignored in: <function BaseFileLock.__del__ at 0x0000028AFF1BF2E0>
Traceback (most recent call last):
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\_api.py", line 240, in __del__
    self.release(force=True)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\_api.py", line 201, in release
    with self._thread_lock:
AttributeError: 'WindowsFileLock' object has no attribute '_thread_lock'
Traceback (most recent call last):
  File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 234, in <module>
    main(args)
  File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 165, in main
    model, image_processor, cross_image_processor, text_processor_infer = load_model(args)
  File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 65, in load_model
    model, model_args = AutoModel.from_pretrained(
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\base_model.py", line 337, in from_pretrained
    return cls.from_pretrained_base(name, args=args, home_path=home_path, url=url, prefix=prefix, build_only=build_only, overwrite_args=overwrite_args, **kwargs)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\base_model.py", line 312, in from_pretrained_base
    model_path = auto_create(name, path=home_path, url=url)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\download.py", line 50, in auto_create
    lock = FileLock(model_path + '.lock', mode=0o777)
TypeError: BaseFileLock.__init__() got an unexpected keyword argument 'mode'
Exception ignored in atexit callback: <function matmul_ext_update_autotune_table at 0x0000028A9750B2E0>
Traceback (most recent call last):
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\deepspeed\ops\transformer\inference\triton\matmul_ext.py", line 444, in matmul_ext_update_autotune_table
    fp16_matmul._update_autotune_table()
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\deepspeed\ops\transformer\inference\triton\matmul_ext.py", line 421, in _update_autotune_table
    TritonMatmul._update_autotune_table(__class__.__name__ + "_2d_kernel", __class__._2d_kernel)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\deepspeed\ops\transformer\inference\triton\matmul_ext.py", line 150, in _update_autotune_table
    cache_manager.put(autotune_table)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\deepspeed\ops\transformer\inference\triton\matmul_ext.py", line 69, in put
    os.rename(self.file_path + ".tmp", self.file_path)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\Users\\King\\.triton\\autotune\\Fp16Matmul_2d_kernel.pickle.tmp' -> 'C:\\Users\\King\\.triton\\autotune\\Fp16Matmul_2d_kernel.pickle'
Press any key to continue . . .

pip freeze output below:

Microsoft Windows [Version 10.0.19045.3803]
(c) Microsoft Corporation. All rights reserved.

G:\CogVLM\CogVLM\venv\Scripts>activate

(venv) G:\CogVLM\CogVLM\venv\Scripts>pip freeze
accelerate==0.26.1
aiofiles==23.2.1
aiohttp==3.9.1
aiosignal==1.3.1
altair==5.2.0
annotated-types==0.6.0
anyio==4.2.0
async-timeout==4.0.3
attrs==23.2.0
bitsandbytes @ https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl
blinker==1.7.0
blis==0.7.11
boto3==1.34.17
botocore==1.34.17
braceexpand==0.1.7
cachetools==5.3.2
catalogue==2.0.10
certifi==2022.12.7
charset-normalizer==2.1.1
click==8.1.7
cloudpathlib==0.16.0
colorama==0.4.6
confection==0.1.4
contourpy==1.2.0
cpm-kernels==1.0.11
cycler==0.12.1
cymem==2.0.8
datasets==2.16.1
deepspeed @ https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/deepspeed-0.11.2_cuda121-cp310-cp310-win_amd64.whl
dill==0.3.7
einops==0.7.0
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl
exceptiongroup==1.2.0
fastapi==0.109.0
ffmpy==0.3.1
filelock==3.9.0
fonttools==4.47.2
frozenlist==1.4.1
fsspec==2023.10.0
gitdb==4.0.11
GitPython==3.1.41
gradio==4.14.0
gradio_client==0.8.0
h11==0.14.0
hjson==3.1.0
httpcore==1.0.2
httpx==0.26.0
huggingface-hub==0.20.2
idna==3.4
importlib-metadata==7.0.1
importlib-resources==6.1.1
Jinja2==3.1.2
jmespath==1.0.1
jsonlines==4.0.0
jsonschema==4.20.0
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langcodes==3.3.0
loguru==0.7.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.8.2
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
murmurhash==1.0.10
networkx==3.0
ninja==1.11.1.1
numpy==1.24.1
orjson==3.9.10
packaging==23.2
pandas==2.1.4
Pillow==9.3.0
preshed==3.0.9
protobuf==4.25.2
psutil==5.9.7
py-cpuinfo==9.0.0
pyarrow==14.0.2
pyarrow-hotfix==0.6
pydantic==2.5.3
pydantic_core==2.14.6
pydeck==0.8.1b0
pydub==0.25.1
Pygments==2.17.2
pynvml==11.5.0
pyparsing==3.1.1
python-dateutil==2.8.2
python-multipart==0.0.6
pytz==2023.3.post1
PyYAML==6.0.1
referencing==0.32.1
regex==2023.12.25
requests==2.28.1
rich==13.7.0
rpds-py==0.16.2
s3transfer==0.10.0
safetensors==0.4.1
scipy==1.11.4
seaborn==0.13.1
semantic-version==2.10.0
sentencepiece==0.1.99
shellingham==1.5.4
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.0
spacy==3.7.2
spacy-legacy==3.0.12
spacy-loggers==1.0.5
srsly==2.4.8
starlette==0.35.1
streamlit==1.30.0
SwissArmyTransformer==0.4.9
sympy==1.12
tenacity==8.2.3
tensorboardX==2.6.2.2
thinc==8.2.2
timm==0.9.12
tokenizers==0.15.0
toml==0.10.2
tomlkit==0.12.0
toolz==0.12.0
torch==2.1.2+cu121
torchaudio==2.1.2+cu121
torchvision==0.16.2+cu121
tornado==6.4
tqdm==4.66.1
transformers==4.36.2
triton @ https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/triton-2.1.0-cp310-cp310-win_amd64.whl
typer==0.9.0
typing_extensions==4.9.0
tzdata==2023.4
tzlocal==5.2
urllib3==1.26.13
uvicorn==0.25.0
validators==0.22.0
wasabi==1.1.2
watchdog==3.0.0
weasel==0.3.4
webdataset==0.2.86
websockets==11.0.3
win32-setctime==1.1.0
xformers==0.0.23.post1
xxhash==3.4.1
yarl==1.9.4
zipp==3.17.0

(venv) G:\CogVLM\CogVLM\venv\Scripts>

Who can help?

@zRzRzRzRzRzRzR @1049451037 @wenyihong @mactavish91 @lykeven @duchenzhuang

Information

FurkanGozukara commented 10 months ago

The error looks like it comes from the file-download step: File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\download.py", line 50, in auto_create

FurkanGozukara commented 10 months ago

After deleting C:\Users\King\.triton\

I still get this error:


[2024-01-12 00:03:13,644] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-12 00:03:13,743] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
Please build and install Nvidia apex package with option '--cuda_ext' according to https://github.com/NVIDIA/apex#from-source .
bin G:\CogVLM\CogVLM\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda121.dll
Exception ignored in: <function BaseFileLock.__del__ at 0x000001A1416DF2E0>
Traceback (most recent call last):
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\_api.py", line 240, in __del__
    self.release(force=True)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\filelock\_api.py", line 201, in release
    with self._thread_lock:
AttributeError: 'WindowsFileLock' object has no attribute '_thread_lock'
Traceback (most recent call last):
  File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 234, in <module>
    main(args)
  File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 165, in main
    model, image_processor, cross_image_processor, text_processor_infer = load_model(args)
  File "G:\CogVLM\CogVLM\basic_demo\web_demo.py", line 65, in load_model
    model, model_args = AutoModel.from_pretrained(
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\base_model.py", line 337, in from_pretrained
    return cls.from_pretrained_base(name, args=args, home_path=home_path, url=url, prefix=prefix, build_only=build_only, overwrite_args=overwrite_args, **kwargs)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\model\base_model.py", line 312, in from_pretrained_base
    model_path = auto_create(name, path=home_path, url=url)
  File "G:\CogVLM\CogVLM\venv\lib\site-packages\sat\resources\download.py", line 50, in auto_create
    lock = FileLock(model_path + '.lock', mode=0o777)
TypeError: BaseFileLock.__init__() got an unexpected keyword argument 'mode'
Press any key to continue . . .
FurkanGozukara commented 10 months ago

OK, I made it work with a custom app.

I am loading it in 4-bit right now.

Are there any disadvantages? How much quality do we lose with 4-bit or 8-bit compared to 16-bit?

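For reference, a minimal sketch of 4-bit loading via transformers + bitsandbytes (the quantization settings here are illustrative assumptions, not necessarily what my custom app or the repo uses):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch: load the HF CogAgent checkpoint with 4-bit quantized weights.
# The config values below are illustrative assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogagent-vqa-hf",
    quantization_config=bnb_config,
    device_map="auto",                      # place layers on the available GPU(s)
    trust_remote_code=True,
).eval()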

FurkanGozukara commented 10 months ago

Another question I have:

I am working on a tutorial on using cogagent-vqa for image captioning for Stable Diffusion training.

Could you let me know the optimal parameters for captioning images? It will be one question and one answer.

Also, is cogagent-vqa best for this? do_sample: false or true? top_p: what value? top_k: what value? temperature: what value?

Can you tell me the default values to set?

zRzRzRzRzRzRzR commented 10 months ago

For your second question, I can answer this:

The default parameters are set according to the default values in our online demo: do_sample = False, temperature = 0.9, top_k = 2.

If you want to set do_sample = True, then use top_p = 0.8.
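As a sketch, those two presets map onto generate() kwargs like this (with do_sample=False decoding is greedy, so temperature and top_k have no effect):

# The two parameter presets described above, as generate() kwargs.
gen_kwargs_default = {
    "do_sample": False,   # greedy decoding; temperature/top_k are then ignored
    "temperature": 0.9,
    "top_k": 2,
}

gen_kwargs_sampling = {
    "do_sample": True,
    "temperature": 0.9,   # assuming temperature keeps its default value
    "top_p": 0.8,
}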

zRzRzRzRzRzRzR commented 10 months ago

By the way, any model with -chat (for example cogvlm-chat) is an image-understanding chat model. If you are using a chat model, you will not get any text-to-image functionality.

FurkanGozukara commented 10 months ago

> By the way, any model with -chat (for example cogvlm-chat) is an image-understanding chat model. If you are using a chat model, you will not get any text-to-image functionality.

Yes, I know. I am looking for the best image-to-text model, one that understands images.

Is cogagent-vqa the best model for understanding images? Thank you.

zRzRzRzRzRzRzR commented 10 months ago

If you are looking for an open-source, high-resolution image understanding model, then CogAgent is currently a very good choice.

FurkanGozukara commented 10 months ago

@zRzRzRzRzRzRzR I would appreciate it a lot if you could verify this.

Currently I am using the THUDM/cogagent-vqa-hf model.

This is the best image understanding model, am I right?

I used the example code you provided on Hugging Face:

import torch
from PIL import Image
from transformers import AutoModelForCausalLM, LlamaTokenizer

# Setup as in the Hugging Face model card example.
MODEL_PATH = "THUDM/cogagent-vqa-hf"
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
torch_type = torch.bfloat16

tokenizer = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch_type,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to(DEVICE).eval()

def post(input_text, temperature, top_p, top_k, image_prompt, do_sample):
    try:
        with torch.no_grad():
            # Load the image, if one was given.
            image = Image.open(image_prompt).convert('RGB') if image_prompt is not None else None

            # Build model inputs for a single-turn query (no history).
            input_by_model = model.build_conversation_input_ids(
                tokenizer, query=input_text, history=[],
                images=([image] if image else None), template_version='base')
            inputs = {
                'input_ids': input_by_model['input_ids'].unsqueeze(0).to(DEVICE),
                'token_type_ids': input_by_model['token_type_ids'].unsqueeze(0).to(DEVICE),
                'attention_mask': input_by_model['attention_mask'].unsqueeze(0).to(DEVICE),
                'images': [[input_by_model['images'][0].to(DEVICE).to(torch_type)]],
            }
            # CogAgent also takes a high-resolution cross image when available.
            if 'cross_images' in input_by_model and input_by_model['cross_images']:
                inputs['cross_images'] = [[input_by_model['cross_images'][0].to(DEVICE).to(torch_type)]]

            gen_kwargs = {
                "max_length": 2048,
                "temperature": temperature,
                "do_sample": do_sample,
                "top_p": top_p,
                "top_k": top_k
            }
            outputs = model.generate(**inputs, **gen_kwargs)
            # Keep only the newly generated tokens, then decode and trim at </s>.
            outputs = outputs[:, inputs['input_ids'].shape[1]:]
            response = tokenizer.decode(outputs[0])
            response = response.split("</s>")[0]
            return response
    except Exception as e:
        return str(e)
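I call it like this, for example (the image path and values here are just an example, using the defaults discussed above):

# Example: zero-shot captioning with do_sample=False and the default values.
# 'my_image.jpg' is a hypothetical path.
print(post("Describe this image in detail.", 0.9, 0.8, 2, "my_image.jpg", False))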
zRzRzRzRzRzRzR commented 10 months ago

This is best if you are just doing VQA, not continuous dialogue. If you want dialogue, you should use this model: https://huggingface.co/THUDM/cogagent-chat-hf
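For multi-turn use, a minimal sketch of how history is passed to build_conversation_input_ids (history is a list of (query, response) pairs; model, tokenizer, and image are as in the code above, and the queries below are hypothetical):

# Sketch: carrying dialogue history with the chat model.
history = [
    ("What is in the image?", "A cat sitting on a sofa."),  # earlier turn (made up)
]
input_by_model = model.build_conversation_input_ids(
    tokenizer,
    query="What color is the cat?",  # follow-up question that sees the history
    history=history,
    images=[image],
)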

FurkanGozukara commented 10 months ago

> This is best if you are just doing VQA, not continuous dialogue. If you want dialogue, you should use this model: https://huggingface.co/THUDM/cogagent-chat-hf

I am doing zero-shot, no continuous chat. Thank you.