Closed yhqqxq closed 4 months ago
That shouldn't happen... Where did you put your .env file? How do you run the program, and from which directory?
I put it in the MaterialSearch root directory. The DEVICE=cuda line takes effect (I can see it does switch to GPU mode), but the line below it doesn't work: starting main.py still reports a connection failure.
I have no way to troubleshoot this from here. As a workaround, you can set the environment variable at the Windows system level.
OK, I'll give that a try.
I'll note this issue down and investigate when I get the chance.
You can try adding this line at the top of run.bat:
SET TRANSFORMERS_OFFLINE=1
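To confirm the variable actually reaches the Python process, a quick stdlib-only diagnostic (not part of the project code) can be run from the same shell session as run.bat:

```python
import os

# Print the value as the Python process sees it. None means the
# variable was not inherited from the shell that launched Python.
offline = os.environ.get("TRANSFORMERS_OFFLINE")
print("TRANSFORMERS_OFFLINE =", offline)
```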
I retested and cannot reproduce your problem. The setting works in my test.
I have set TRANSFORMERS_OFFLINE both in the system environment variables and in the .bat script, and the script does detect it, but the error still occurs:
(llama2codeinterpreter) V:\ai\MaterialSearchWindows>benchmark.bat
(llama2codeinterpreter) V:\ai\MaterialSearchWindows>set TRANSFORMERS_OFFLINE=1
(llama2codeinterpreter) V:\ai\MaterialSearchWindows>python benchmark.py
********** 运行配置 / RUNNING CONFIGURATIONS **********
HOST: '0.0.0.0'
PORT: 8085
ASSETS_PATH: ('C:\\Users\\Administrator\\Pictures', 'C:\\Users\\Administrator\\Videos')
SKIP_PATH: ('/tmp',)
IMAGE_EXTENSIONS: ('.jpg', '.jpeg', '.png', '.gif', '.heic', '.webp', '.bmp')
VIDEO_EXTENSIONS: ('.mp4', '.flv', '.mov', '.mkv', '.webm', '.avi', '.vob')
IGNORE_STRINGS: ('thumb', 'avatar', '__macosx', 'icons', 'cache', '.filerun.thumbnails', '.ts')
FRAME_INTERVAL: 2
SCAN_PROCESS_BATCH_SIZE: 8
IMAGE_MIN_WIDTH: 64
IMAGE_MIN_HEIGHT: 64
AUTO_SCAN: False
AUTO_SCAN_START_TIME: (22, 30)
AUTO_SCAN_END_TIME: (8, 0)
AUTO_SAVE_INTERVAL: 100
MODEL_NAME: 'OFA-Sys/chinese-clip-vit-base-patch16'
DEVICE: 'cpu'
CACHE_SIZE: 64
POSITIVE_THRESHOLD: 36
NEGATIVE_THRESHOLD: 36
IMAGE_THRESHOLD: 85
LOG_LEVEL: 'INFO'
SQLALCHEMY_DATABASE_URL: 'sqlite:///./instance/assets.db'
TEMP_PATH: './tmp'
VIDEO_EXTENSION_LENGTH: 0
ENABLE_LOGIN: False
USERNAME: 'Jackxwb'
PASSWORD: 'MaterialSearch'
FLASK_DEBUG: False
**************************************************
>>> TRANSFORMERS_OFFLINE = 1
**************************************************
Loading models[OFA-Sys/chinese-clip-vit-base-patch16]...
Traceback (most recent call last):
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connection.py", line 203, in _new_conn
sock = connection.create_connection(
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
raise err
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connectionpool.py", line 790, in urlopen
response = self._make_request(
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connectionpool.py", line 491, in _make_request
raise new_e
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
self._validate_conn(conn)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connectionpool.py", line 1096, in _validate_conn
conn.connect()
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connection.py", line 611, in connect
self.sock = sock = self._new_conn()
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connection.py", line 212, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x00000219EAAB4F70>, 'Connection to huggingface.co timed out. (connect timeout=10)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\requests\adapters.py", line 589, in send
resp = conn.urlopen(
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\connectionpool.py", line 844, in urlopen
retries = retries.increment(
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\urllib3\util\retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /OFA-Sys/chinese-clip-vit-base-patch16/resolve/main/model.safetensors (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x00000219EAAB4F70>, 'Connection to huggingface.co timed out. (connect timeout=10)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "V:\ai\MaterialSearchWindows\benchmark.py", line 16, in <module>
clip_model = AutoModelForZeroShotImageClassification.from_pretrained(MODEL_NAME)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\transformers\models\auto\auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\transformers\modeling_utils.py", line 3415, in from_pretrained
if not has_file(pretrained_model_name_or_path, safe_weights_name, **has_file_kwargs):
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\transformers\utils\hub.py", line 629, in has_file
r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=10)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\requests\api.py", line 100, in head
return request("head", url, **kwargs)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\requests\adapters.py", line 610, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /OFA-Sys/chinese-clip-vit-base-patch16/resolve/main/model.safetensors (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x00000219EAAB4F70>, 'Connection to huggingface.co timed out. (connect timeout=10)'))
(llama2codeinterpreter) V:\ai\MaterialSearchWindows>pause
Press any key to continue . . .
Files I modified:
\MaterialSearchWindows\config.py, appended at the end of the file:
print(f">>> TRANSFORMERS_OFFLINE = {os.getenv('TRANSFORMERS_OFFLINE')}")
print("**************************************************")
\MaterialSearchWindows\benchmark.py, line 15 changed to:
print(f"Loading models[{MODEL_NAME}]...")
As the console output shows, TRANSFORMERS_OFFLINE = 1, yet the error still occurs.
python -V
Python 3.10.14
With the proxy enabled there is a warning; not sure if it's related:
********** 运行配置 / RUNNING CONFIGURATIONS **********
HOST: '0.0.0.0'
PORT: 8085
ASSETS_PATH: ('C:\\Users\\Administrator\\Pictures', 'C:\\Users\\Administrator\\Videos')
SKIP_PATH: ('/tmp',)
IMAGE_EXTENSIONS: ('.jpg', '.jpeg', '.png', '.gif', '.heic', '.webp', '.bmp')
VIDEO_EXTENSIONS: ('.mp4', '.flv', '.mov', '.mkv', '.webm', '.avi', '.vob')
IGNORE_STRINGS: ('thumb', 'avatar', '__macosx', 'icons', 'cache', '.filerun.thumbnails', '.ts')
FRAME_INTERVAL: 2
SCAN_PROCESS_BATCH_SIZE: 8
IMAGE_MIN_WIDTH: 64
IMAGE_MIN_HEIGHT: 64
AUTO_SCAN: False
AUTO_SCAN_START_TIME: (22, 30)
AUTO_SCAN_END_TIME: (8, 0)
AUTO_SAVE_INTERVAL: 100
MODEL_NAME: 'OFA-Sys/chinese-clip-vit-base-patch16'
DEVICE: 'cpu'
CACHE_SIZE: 64
POSITIVE_THRESHOLD: 36
NEGATIVE_THRESHOLD: 36
IMAGE_THRESHOLD: 85
LOG_LEVEL: 'INFO'
SQLALCHEMY_DATABASE_URL: 'sqlite:///./instance/assets.db'
TEMP_PATH: './tmp'
VIDEO_EXTENSION_LENGTH: 0
ENABLE_LOGIN: False
USERNAME: 'Jackxwb'
PASSWORD: 'MaterialSearch'
FLASK_DEBUG: False
**************************************************
>>> TRANSFORMERS_OFFLINE = 1
**************************************************
Loading models[OFA-Sys/chinese-clip-vit-base-patch16]...
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Models loaded.
**************************************************
Starting the image-processing performance benchmark. Shorter time is better.
This problem really is strange; I have no clue at the moment.
@Jackxwb Could you check the HF_HOME environment variable and see what path it is set to?
Script changes: appended at the end of \MaterialSearchWindows\config.py:
print(f">>> TRANSFORMERS_OFFLINE = {os.getenv('TRANSFORMERS_OFFLINE')}")
print(f">>> HF_HOME = {os.getenv('HF_HOME')}")
print("**************************************************")
Output:
>>> TRANSFORMERS_OFFLINE = 1
>>> HF_HOME = huggingface
**************************************************
@Jackxwb Please also try adding this at the end of config.py and check the output:
current_directory = os.getcwd()
huggingface_exists = os.path.isdir(os.path.join(current_directory, 'huggingface'))
print(current_directory, huggingface_exists)
I expect True. If it prints True, I don't know where the problem is; if it prints False, we can keep digging.
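The reason the working directory matters (an assumption about how a relative HF_HOME is handled, not verified against the library source): a value like "huggingface" is resolved like any relative path, against wherever the process is launched, so the effective cache location shifts with the launch directory. A stdlib-only sketch:

```python
import os

# Illustration: a relative HF_HOME resolves against the current
# working directory; the hub cache lives in its "hub" subfolder.
hf_home = "huggingface"  # the value seen in this thread
effective_cache = os.path.join(os.path.abspath(hf_home), "hub")
print(effective_cache)
```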
The output is:
**************************************************
>>> TRANSFORMERS_OFFLINE = 1
>>> HF_HOME = huggingface
**************************************************
V:\ai\MaterialSearchWindows True
**************************************************
The directory structure is:
V:\ai\MaterialSearchWindows\huggingface>tree
Folder PATH listing for volume New Volume
Volume serial number is F8A2-83D9
V:.
└─hub
├─.locks
│ └─models--OFA-Sys--chinese-clip-vit-base-patch16
└─models--OFA-Sys--chinese-clip-vit-base-patch16
├─.no_exist
│ └─36e679e65c2a2fead755ae21162091293ad37834
├─refs
└─snapshots
└─36e679e65c2a2fead755ae21162091293ad37834
I noticed that model.safetensors, the file in the error message, is 0 bytes on my disk.
I also found that with the proxy on, it doesn't use the V:\ai\MaterialSearchWindows\huggingface folder either; it downloads to C:\Users\<current user>\.cache\huggingface instead. If I delete both folders, it re-downloads to that location by itself.
Download log:
**************************************************
>>> TRANSFORMERS_OFFLINE = 0
>>> HF_HOME = huggingface
>>> force_download = None
**************************************************
V:\ai\MaterialSearchWindows True
**************************************************
Loading models[OFA-Sys/chinese-clip-vit-base-patch16]...
config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.01k/3.01k [00:00<00:00, 3.01MB/s]
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\huggingface_hub\file_download.py:157: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Jackxwb\.cache\huggingface\hub\models--OFA-Sys--chinese-clip-vit-base-patch16. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
pytorch_model.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 753M/753M [00:46<00:00, 16.1MB/s]
preprocessor_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 342/342 [00:00<?, ?B/s]
D:\ProgramData\anaconda3\envs\llama2codeinterpreter\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
vocab.txt: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 110k/110k [00:00<00:00, 365kB/s]
Models loaded.
**************************************************
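The fallback observed above matches the documented default: when HF_HOME is unset, huggingface_hub caches under ~/.cache/huggingface, with model files in a "hub" subfolder. A simplified stdlib sketch of that resolution (the helper name is made up for illustration):

```python
import os

def hub_cache_dir(env):
    # Simplified: HF_HOME overrides the default user cache location;
    # either way, the hub cache is the "hub" subdirectory beneath it.
    hf_home = env.get("HF_HOME")
    if not hf_home:
        hf_home = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    return os.path.join(hf_home, "hub")

print(hub_cache_dir({}))                          # default user cache
print(hub_cache_dir({"HF_HOME": "huggingface"}))  # relative override
```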
Not sure if it's related to the huggingface_hub version:
>pip show huggingface_hub
Name: huggingface-hub
Version: 0.23.2
Summary: Client library to download and publish models, datasets and other repos on the huggingface.co hub
Home-page: https://github.com/huggingface/huggingface_hub
Author: Hugging Face, Inc.
Author-email: julien@huggingface.co
License: Apache
Location: d:\programdata\anaconda3\envs\llama2codeinterpreter\lib\site-packages
Requires: filelock, fsspec, packaging, pyyaml, requests, tqdm, typing-extensions
Required-by: datasets, diffusers, gradio, gradio_client, tokenizers, transformers
Only after I set HF_HOME to ./huggingface does it use the V:\ai\MaterialSearchWindows\huggingface folder (the folder was empty; with the proxy on, it downloads there), but with the proxy off it still errors.
The huggingface_hub version is fine.
> Only after I set HF_HOME to ./huggingface does it use the V:\ai\MaterialSearchWindows\huggingface folder
That sounds like a possible solution.
> I noticed that model.safetensors, the file in the error message, is 0 bytes on my disk
That repository simply doesn't contain that file; this is normal.
Can you try setting HF_HOME to ./huggingface, deleting the huggingface folder, extracting a fresh copy from the archive to put back in its place, and then seeing whether it works? @Jackxwb
Same error.
I just accidentally deleted the huggingface folder without TRANSFORMERS_OFFLINE set, went online, and the new error mentioned this document:
https://huggingface.co/docs/transformers/installation#offline-mode
I'm now looking into other options.
Following the docs, I changed the code so that in online mode it downloads the model (bundling it into the all-in-one package should also work) and saves a copy to a custom directory; in offline mode it then loads that directory directly, and the model loads fine.
benchmark.py:
import os  # needed for os.getenv below

is_offline = os.getenv('TRANSFORMERS_OFFLINE')
print(f"Loading models[{MODEL_NAME}, offLine={is_offline}]...")
if is_offline:
    # Offline mode: load from the locally saved copy only
    clip_model = AutoModelForZeroShotImageClassification.from_pretrained("./onlineModel", local_files_only=True)
    clip_processor = AutoProcessor.from_pretrained("./onlineModel", local_files_only=True)
    print("Models loaded[OFFLINE].")
else:
    clip_model = AutoModelForZeroShotImageClassification.from_pretrained(MODEL_NAME)
    clip_processor = AutoProcessor.from_pretrained(MODEL_NAME)
    print("Models loaded[ONLINE].")
    # Save the model for offline use
    clip_model.save_pretrained("./onlineModel")
    clip_processor.save_pretrained("./onlineModel")
    print("Models saved.")
Log output:
**************************************************
>>> TRANSFORMERS_OFFLINE = 1
>>> HF_DATASETS_OFFLINE = None
>>> HF_HOME = huggingface
>>> force_download = None
>>> local_files_only = None
**************************************************
V:\ai\MaterialSearchWindows True
**************************************************
Loading models[OFA-Sys/chinese-clip-vit-base-patch16, offLine=1]...
Models loaded[OFFLINE].
**************************************************
Starting the image-processing performance benchmark. Shorter time is better.
Replacing the offline-mode ./onlineModel with .\huggingface\hub\models--OFA-Sys--chinese-clip-vit-base-patch16\snapshots\36e679e65c2a2fead755ae21162091293ad37834 also works; the path is just rather long.
Huh, it seems that just setting MODEL_NAME to the local path enables offline loading, with no code changes needed at all.
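This works because from_pretrained accepts either a Hub repo id or a local directory; if the path exists on disk it is loaded directly, with no network access. A rough sketch of that dispatch (the function is illustrative, not the actual transformers code):

```python
import os
import tempfile

def resolve_source(name_or_path):
    # Illustrative only: an existing directory is treated as a local
    # model folder; anything else is assumed to be a Hub repo id.
    return "local" if os.path.isdir(name_or_path) else "hub"

with tempfile.TemporaryDirectory() as model_dir:
    print(resolve_source(model_dir))  # a real directory loads locally
print(resolve_source("OFA-Sys/chinese-clip-vit-base-patch16"))
```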
This looks like an upstream problem. See: https://github.com/huggingface/transformers/issues/30345 https://github.com/huggingface/transformers/issues/30469
A commit a few days ago fixed it: https://github.com/huggingface/transformers/pull/31016/files#diff-82b93b530be62e40679876a764438660dedcd9cc9e33c2374ed21b14ebef5dbaL630
The current all-in-one package ships transformers 4.41.1, and this commit landed in 4.41.2 (see: https://github.com/huggingface/transformers/releases/tag/v4.41.2 ). I'll repackage, which should resolve this issue.
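For reference, the gist of the fix (paraphrased from the linked PR, not the literal diff) is to honour offline mode before attempting the remote existence check for model.safetensors:

```python
def offline_mode_is_enabled(env):
    # Paraphrased sketch: values of TRANSFORMERS_OFFLINE treated as
    # "on". When enabled, the remote has_file() check that caused the
    # timeout in this thread should be skipped entirely.
    return env.get("TRANSFORMERS_OFFLINE", "0").upper() in ("1", "ON", "YES", "TRUE")

def should_check_remote_file(env):
    return not offline_mode_is_enabled(env)
```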
@Jackxwb The new package is ready. Could you test whether it works now? https://github.com/chn-lee-yumi/MaterialSearch/releases/tag/v0.0.0-20240603
CPU mode now runs fine with both run.bat and benchmark.bat, and the LOG_LEVEL setting from #73 also takes effect.
I have set TRANSFORMERS_OFFLINE=1 as the instructions require, but it has no effect: as soon as I turn off the proxy, the error appears. How can I fix this?
.env contents:
DEVICE=cuda
TRANSFORMERS_OFFLINE=1
The error:
Traceback (most recent call last):
  File "G:\AI\AI\tool\MaterialSearch-m\env\lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
  File "G:\AI\AI\tool\MaterialSearch-m\env\lib\site-packages\urllib3\connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "G:\AI\AI\tool\MaterialSearch-m\env\lib\site-packages\urllib3\util\retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /OFA-Sys/chinese-clip-vit-base-patch16/resolve/main/model.safetensors (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x00000277F5EDF460>, 'Connection to huggingface.co timed out. (connect timeout=10)'))