chn-lee-yumi / MaterialSearch

AI semantic search for local media: search by image, find local assets, match images to text descriptions, search video frames, and find videos by scene description. Search local photos and videos through natural language.
GNU General Public License v3.0
863 stars 117 forks

ValueError when the model is set to OFA-Sys/chinese-clip-vit-large-patch14-336px: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 768 is different from 512) #114

Closed: lynneTest closed this issue 2 months ago

lynneTest commented 2 months ago

### System: Linux

### Deployment method: from source

### Configuration:

********** 运行配置 / RUNNING CONFIGURATIONS **********
HOST: '0.0.0.0'
PORT: 8085
ASSETS_PATH: ('/home/[usrname]/datasets/PASCAL-VOC-2007',)
SKIP_PATH: ('/tmp',)
IMAGE_EXTENSIONS: ('.jpg', '.jpeg', '.png', '.gif', '.heic', '.webp', '.bmp')
VIDEO_EXTENSIONS: ('.mp4', '.flv', '.mov', '.mkv', '.webm', '.avi')
IGNORE_STRINGS: ('thumb', 'avatar', '__macosx', 'icons', 'cache')
FRAME_INTERVAL: 2
SCAN_PROCESS_BATCH_SIZE: 8
IMAGE_MIN_WIDTH: 64
IMAGE_MIN_HEIGHT: 64
AUTO_SCAN: False
AUTO_SCAN_START_TIME: (22, 30)
AUTO_SCAN_END_TIME: (8, 0)
AUTO_SAVE_INTERVAL: 100
MODEL_NAME: 'OFA-Sys/chinese-clip-vit-large-patch14-336px'
DEVICE: 'cuda'
CACHE_SIZE: 64
POSITIVE_THRESHOLD: 36
NEGATIVE_THRESHOLD: 36
IMAGE_THRESHOLD: 85
LOG_LEVEL: 'INFO'
SQLALCHEMY_DATABASE_URL: 'sqlite:///./instance/assets.db'
TEMP_PATH: './tmp'
VIDEO_EXTENSION_LENGTH: 0
ENABLE_LOGIN: False
USERNAME: 'admin'
PASSWORD: 'MaterialSearch'
FLASK_DEBUG: False
HF_HOME: None
HF_HUB_OFFLINE: None
TRANSFORMERS_OFFLINE: None
CWD: /home/[username]/MaterialSearch
**************************************************
/home/[username]/anaconda3/envs/ms_env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
 * Serving Flask app 'main'
 * Debug mode: off
2024-09-02 09:21:06,948 werkzeug INFO WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:8085
 * Running on http://192.168.xx.xxx:8085
2024-09-02 09:21:06,948 werkzeug INFO Press CTRL+C to quit

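### Error: In the web UI, I clicked Scan, entered `dog`, set the threshold to 20 and then 50, and clicked Search using the text-to-image search function. The page returned the following: (Image 1) The terminal printed this error: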

2024-09-02 09:21:31,726 main ERROR Exception on /api/match [POST]
Traceback (most recent call last):
  File "/home/zhuml/anaconda3/envs/ms_env/lib/python3.10/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/zhuml/anaconda3/envs/ms_env/lib/python3.10/site-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/zhuml/anaconda3/envs/ms_env/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/zhuml/anaconda3/envs/ms_env/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/home/zhuml/MaterialSearch/main.py", line 69, in wrapper
    return view_func(*args, **kwargs)
  File "/home/zhuml/MaterialSearch/main.py", line 167, in api_match
    results = search_image_by_text(data["positive"], data["negative"], positive_threshold, negative_threshold)
  File "/home/zhuml/MaterialSearch/search.py", line 89, in search_image_by_text
    return search_image_by_feature(positive_feature, negative_feature, positive_threshold, negative_threshold)
  File "/home/zhuml/MaterialSearch/search.py", line 57, in search_image_by_feature
    scores = match_batch(positive_feature, negative_feature, features, positive_threshold, negative_threshold)
  File "/home/zhuml/MaterialSearch/process_assets.py", line 249, in match_batch
    positive_scores = new_features @ new_text_positive_feature.T
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 768 is different from 512)
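
The message can be read directly from the failing line `new_features @ new_text_positive_feature.T`: operand 1 (the transposed text feature) has 768 rows, while operand 0 (the image features) has 512 columns. The sketch below reproduces that shape mismatch with NumPy alone. The array contents and the helper name `try_match` are made up for illustration and are not taken from MaterialSearch; why the two widths differ in this deployment is not established in this thread.

```python
import numpy as np

# Hypothetical stand-ins: four image features of width 512
# and one text feature of width 768 (values are irrelevant here).
image_features = np.zeros((4, 512), dtype=np.float32)
text_feature = np.zeros((1, 768), dtype=np.float32)


def try_match(features, text):
    """Return the score matrix, or the error text if the widths differ."""
    try:
        # Same shape pattern as the failing line: (n, k) @ (k, m)
        return features @ text.T
    except ValueError as exc:
        return str(exc)


result = try_match(image_features, text_feature)
print(result)
```

When both arrays have the same width, the same call returns an `(n, 1)` score matrix instead of an error string.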

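### What I tried:

1. With `config.py` set to the OFA-Sys/chinese-clip-vit-base-patch16 or openai/clip-vit-base-patch16 model, `main.py` runs normally and text-to-image search in the web UI works.
2. With `config.py` set to the OFA-Sys/chinese-clip-vit-large-patch14-336px model, I debugged in VS Code with the Python debugger. Inspecting variables at breakpoints, I found no matrix of size 512, and text-to-image search in the web UI worked normally.
3. With `config.py` set to the OFA-Sys/chinese-clip-vit-large-patch14-336px model and `main.py` run from the terminal, text-to-image search in the web UI raises the ValueError above: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 768 is different from 512).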
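
Since the debugger run and the terminal run behaved differently, a width check placed just before the matrix product could show which side carries the 512-wide array in each run. The following is a minimal sketch of such a check. The function `check_feature_widths` and its parameter names are hypothetical and do not exist in MaterialSearch.

```python
import numpy as np


def check_feature_widths(stored_features, query_feature):
    """Raise a readable error if stored and query features differ in width.

    Hypothetical diagnostic helper: compares the last axis of both arrays
    and returns the shared width when they agree.
    """
    stored_width = stored_features.shape[-1]
    query_width = query_feature.shape[-1]
    if stored_width != query_width:
        raise ValueError(
            f"stored features have width {stored_width}, "
            f"but the query feature has width {query_width}"
        )
    return stored_width


# Example with made-up arrays of matching width
width = check_feature_widths(np.zeros((2, 768)), np.zeros((1, 768)))
print(width)
```

Called with a 512-wide stored array and a 768-wide query, it raises a `ValueError` that names both widths, which is easier to act on than the raw gufunc message.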

lynneTest commented 2 months ago

Sorry, someone has already asked about this.