eric-ai-lab / MiniGPT-5

Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
https://eric-ai-lab.github.io/minigpt-5.github.io/
Apache License 2.0
853 stars, 52 forks

[BUG] Still not working: TypeError: unsupported operand type(s) for //: 'NoneType' and 'int' when running python playground.py #35

Open lckj2009 opened 11 months ago

lckj2009 commented 11 months ago

Hello, I get an error (TypeError: unsupported operand type(s) for //: 'NoneType' and 'int') when running python playground.py.

Operating system: ubuntu 20.04

Python 3.9.18

Other parameters: same as MiniGPT-5/requirements.txt

All three ckpt files are located in MiniGPT-5/config. The configuration files have all been updated. The weights used are Vicuna-7b-v1.1. However, the following error still occurs.

Running "python playground.py --stage1_weight /root/MiniGPT-5/config/stage1_cc3m.ckpt --test_weight /root/MiniGPT-5/config/stage2_vist.ckpt" produces the following error:

Seed set to 42
Loading VIT
Traceback (most recent call last):
  File "/root/MiniGPT-5/examples/playground.py", line 40, in <module>
    minigpt5 = MiniGPT5_Model.load_from_checkpoint(stage1_ckpt, strict=False, map_location="cpu", encoder_model_config=model_args, **vars(training_args))
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/module.py", line 1552, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 89, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 156, in _load_state
    obj = cls(**_cls_kwargs)
  File "/root/MiniGPT-5/model.py", line 68, in __init__
    self.model = MiniGPT5.from_config(minigpt4_config.model_cfg)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 247, in from_config
    model = cls(
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt5.py", line 46, in __init__
    super().__init__(*args, **kwargs)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 53, in __init__
    self.visual_encoder, self.ln_vision = self.init_vision_encoder(
  File "/root/MiniGPT-5/minigpt4/models/blip2.py", line 65, in init_vision_encoder
    visual_encoder = create_eva_vit_g(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 416, in create_eva_vit_g
    model = VisionTransformer(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 259, in __init__
    self.patch_embed = PatchEmbed(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in __init__
    num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

lckj2009 commented 11 months ago

this is my pip list:

accelerate 0.24.1 aiofiles 23.2.1 aiohttp 3.8.4 aiosignal 1.3.1 altair 5.2.0 annotated-types 0.6.0 antlr4-python3-runtime 4.9.3 anyio 3.7.1 appdirs 1.4.4 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 async-lru 2.0.4 async-timeout 4.0.2 attrs 22.2.0 Babel 2.13.1 backoff 2.2.1 beautifulsoup4 4.12.2 bleach 6.1.0 blessed 1.20.0 blis 0.7.11 boto3 1.33.2 botocore 1.33.2 braceexpand 0.1.7 catalogue 2.0.10 cchardet 2.1.7 certifi 2023.11.17 cffi 1.16.0 chardet 5.1.0 charset-normalizer 3.3.2 click 8.1.7 cmake 3.27.7 comm 0.2.0 confection 0.1.4 contourpy 1.0.7 croniter 1.4.1 cycler 0.11.0 cymem 2.0.8 dateutils 0.6.12 debugpy 1.8.0 decorator 5.1.1 decord 0.6.0 deepdiff 6.7.1 defusedxml 0.7.1 diffusers 0.23.1 docker-pycreds 0.4.0 exceptiongroup 1.2.0 executing 2.0.1 fastapi 0.104.1 fastjsonschema 2.19.0 ffmpy 0.3.1 filelock 3.9.0 fonttools 4.38.0 fqdn 1.5.1 frozenlist 1.3.3 fsspec 2023.10.0 ftfy 6.1.3 gitdb 4.0.11 GitPython 3.1.40 gradio 3.50.0 gradio_client 0.6.1 h11 0.14.0 httpcore 1.0.2 httpx 0.25.2 huggingface-hub 0.19.4 idna 3.6 importlib-metadata 6.8.0 importlib-resources 5.12.0 inquirer 3.1.4 iopath 0.1.10 ipykernel 6.27.1 ipython 8.18.1 isoduration 20.11.0 itsdangerous 2.1.2 jedi 0.19.1 Jinja2 3.1.2 jmespath 1.0.1 joblib 1.3.2 json5 0.9.14 jsonpointer 2.4 jsonschema 4.20.0 jsonschema-specifications 2023.11.1 jupyter_client 8.6.0 jupyter_core 5.5.0 jupyter-events 0.9.0 jupyter-lsp 2.2.1 jupyter_server 2.11.1 jupyter_server_terminals 0.4.4 jupyterlab 4.0.9 jupyterlab_pygments 0.3.0 jupyterlab_server 2.25.2 kiwisolver 1.4.4 langcodes 3.3.0 lightning 2.0.9.post0 lightning-cloud 0.5.55 lightning-utilities 0.10.0 linkify-it-py 2.0.2 lit 17.0.6 llvmlite 0.41.1 markdown-it-py 2.2.0 MarkupSafe 2.1.3 matplotlib 3.7.0 matplotlib-inline 0.1.6 mdit-py-plugins 0.3.3 mdurl 0.1.2 mistune 3.0.2 mpmath 1.3.0 multidict 6.0.4 murmurhash 1.0.10 nbclient 0.9.0 nbconvert 7.11.0 nbformat 5.9.2 nest-asyncio 1.5.8 networkx 3.2.1 nltk 3.8.1 notebook 7.0.6 
notebook_shim 0.2.3 numba 0.58.1 numpy 1.26.2 nvidia-cublas-cu11 11.10.3.66 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu11 11.7.101 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu11 11.7.99 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu11 11.7.99 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu11 8.5.0.96 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu11 10.9.0.58 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu11 10.2.10.91 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu11 11.4.0.1 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu11 11.7.4.91 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu11 2.14.3 nvidia-nccl-cu12 2.18.1 nvidia-nvjitlink-cu12 12.3.101 nvidia-nvtx-cu11 11.7.91 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 open-clip-torch 2.23.0 opencv-python 4.8.1.78 ordered-set 4.1.0 orjson 3.9.10 overrides 7.4.0 packaging 23.0 pandas 2.1.3 pandocfilters 1.5.0 parso 0.8.3 pathy 0.10.3 peft 0.6.2 pexpect 4.9.0 Pillow 10.1.0 pip 23.3.1 platformdirs 4.0.0 portalocker 2.8.2 preshed 3.0.9 prometheus-client 0.19.0 prompt-toolkit 3.0.41 protobuf 4.25.1 psutil 5.9.4 ptyprocess 0.7.0 pure-eval 0.2.2 pycocoevalcap 1.2 pycocotools 2.0.6 pycparser 2.21 pydantic 1.10.13 pydantic_core 2.4.0 pydub 0.25.1 Pygments 2.17.2 PyJWT 2.8.0 pynndescent 0.5.11 pyparsing 3.0.9 python-dateutil 2.8.2 python-editor 1.0.4 python-json-logger 2.0.7 python-multipart 0.0.6 pytorch-fid 0.3.0 pytorch-lightning 2.1.2 pytz 2023.3.post1 PyYAML 6.0 pyzmq 25.1.1 readchar 4.0.5 referencing 0.31.0 regex 2022.10.31 requests 2.31.0 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.7.0 rouge 1.0.1 rpds-py 0.13.1 s3transfer 0.8.1 safetensors 0.4.1 scikit-learn 1.3.2 scipy 1.11.4 semantic-version 2.10.0 Send2Trash 1.8.2 sentence-transformers 2.2.2 sentencepiece 0.1.99 sentry-sdk 1.37.1 setproctitle 1.3.3 setuptools 68.0.0 six 1.16.0 smart-open 6.4.0 smmap 5.0.1 sniffio 1.3.0 soupsieve 2.5 spacy 3.5.1 spacy-legacy 3.0.12 spacy-loggers 1.0.5 srsly 2.4.8 stack-data 0.6.3 starlette 0.27.0 starsessions 
1.3.0 sympy 1.12 tenacity 8.2.2 terminado 0.18.0 thinc 8.1.12 threadpoolctl 3.2.0 timm 0.6.13 tinycss2 1.2.1 tokenizers 0.13.3 tomli 2.0.1 toolz 0.12.0 torch 2.0.1 torch-fidelity 0.3.0 torchmetrics 1.2.0 torchvision 0.15.2 tornado 6.3.3 tqdm 4.64.1 traitlets 5.14.0 transformers 4.31.0 triton 2.0.0 typer 0.7.0 types-python-dateutil 2.8.19.14 typing_extensions 4.8.0 tzdata 2023.3 uc-micro-py 1.0.2 umap-learn 0.5.5 uri-template 1.3.0 urllib3 1.26.18 uvicorn 0.24.0.post1 wandb 0.16.0 wasabi 1.1.2 wcwidth 0.2.12 webcolors 1.13 webdataset 0.2.48 webencodings 0.5.1 websocket-client 1.6.4 websockets 11.0.3 wheel 0.41.2 xformers 0.0.22 yarl 1.8.2 zipp 3.14.0

lckj2009 commented 11 months ago

These are my weight files, shown in the following figure:

[image: 1]

lckj2009 commented 11 months ago

This is my /root/MiniGPT-5/config/minigpt4.yaml:

model:
  arch: minigpt5
  model_type: pretrain_vicuna
  freeze_vit: True
  freeze_qformer: True
  max_txt_len: 160
  end_sym: "###"
  prompt_path: ""
  prompt_template: '###Human: {} ###Assistant: '
  ckpt: '/root/MiniGPT-5/config/prerained_minigpt4_7b.pth'
  using_lora: True

datasets:
  cc_sbu_align:
    vis_processor:
      train:
        name: "blip2_image_eval"
        image_size: 224
    text_processor:
      train:
        name: "blip_caption"

run:
  task: image_text_pretrain
  # optimizer
  lr_sched: "linear_warmup_cosine_lr"
  init_lr: 3e-5
  min_lr: 1e-5
  warmup_lr: 1e-6

  weight_decay: 0.05
  max_epoch: 5
  iters_per_epoch: 200
  batch_size_train: 12
  batch_size_eval: 12
  num_workers: 4
  warmup_steps: 200

  seed: 42
  output_dir: "output/minigpt4_stage2_finetune"

  amp: True
  resume_ckpt_path: null

  evaluate: False
  train_splits: ["train"]

  device: "cuda"
  world_size: 1
  dist_url: "env://"
  distributed: True

lckj2009 commented 11 months ago

This is my /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml:

model:
  arch: mini_gpt4

  # vit encoder
  image_size: 224
  drop_path_rate: 0
  use_grad_checkpoint: False
  vit_precision: "fp16"
  freeze_vit: True
  freeze_qformer: True

  # Q-Former
  num_query_token: 32

  # Vicuna
  llama_model: "/root/vicuna-7b-v1.1"

  # generation configs
  prompt: ""

preprocess:
  vis_processor:
    train:
      name: "blip2_image_train"
      image_size: 224
    eval:
      name: "blip2_image_eval"
      image_size: 224
  text_processor:
    train:
      name: "blip_caption"
    eval:
      name: "blip_caption"

lckj2009 commented 11 months ago

This is my /root/vicuna-7b-v1.1 directory, shown in the following figure:

[image: 2]

lckj2009 commented 11 months ago

Everything, including "Vicuna-7b-v1.1", looks fine. The paths and configuration files are correct, so why does this error still occur? My installed versions are torch==2.0.1 and lightning==2.0.9.post0.

KzZheng commented 11 months ago

I'm trying to help. According to your error record:

  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in __init__
    num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

This means img_size is None when the model is loaded.

But according to your config file, you have set image_size: 224 in /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml, so I'm confused. Can you check minigpt4_config.model_cfg.image_size after line 67?
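For readers following along, here is a minimal sketch of how a missing image size leads to this exact TypeError. It assumes the size is expanded into a pair before the division. `to_2tuple` and `count_patches` are hypothetical stand-ins written for illustration, not the code in eva_vit.py, and the patch size of 14 is also an assumption.

```python
def to_2tuple(value):
    """Hypothetical helper: turn a scalar size into a (height, width) pair."""
    return value if isinstance(value, tuple) else (value, value)


def count_patches(img_size, patch_size):
    """Mirror the arithmetic on the failing line of the traceback."""
    img_size = to_2tuple(img_size)      # None becomes (None, None)
    patch_size = to_2tuple(patch_size)
    return (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])


# With a real size the division works.
print(count_patches(224, 14))

# With a missing size, None reaches the floor division and Python raises.
try:
    count_patches(None, 14)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

If this matches what happens in the repository, the question is no longer about the vision encoder itself but about why the config handed to the model has no image size.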

lckj2009 commented 11 months ago

May I ask which file line 67 is in? Is it line 67 of /root/MiniGPT-5/model.py?

[image: 3]

KzZheng commented 11 months ago

Yes

lckj2009 commented 11 months ago

Yes

Please wait a moment, I need to set up a debugging environment

lckj2009 commented 11 months ago

  File "/root/MiniGPT-5/model.py", line 68, in __init__
    print('minigpt4_config.model_cfg.image_size' + str(minigpt4_config.model_cfg.image_size))
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 355, in __getattr__
    self._format_and_raise(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 351, in __getattr__
    return self._get_impl(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 442, in _get_impl
    node = self._get_child(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 73, in _get_child
    child = self._get_node(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 480, in _get_node
    raise ConfigKeyError(f"Missing key {key!s}")
omegaconf.errors.ConfigAttributeError: Missing key image_size
    full_key: model.image_size
    object_type=dict

It seems this statement cannot run: print('minigpt4_config.model_cfg.image_size' + str(minigpt4_config.model_cfg.image_size))

[image: 4]
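A "Missing key" error like the one above says the key is absent from that section of the loaded config, which is different from the key being present with a wrong value. The sketch below shows one way to probe a nested config without raising. It uses plain dictionaries as a stand-in for the OmegaConf object, and `get_by_path` and `stage_config` are hypothetical names made up for this example.

```python
def get_by_path(config, dotted_key):
    """Walk a nested dict along 'a.b.c'.

    Returns (True, value) when every part exists, otherwise
    (False, path_of_the_first_missing_part).
    """
    node = config
    walked = []
    for part in dotted_key.split("."):
        walked.append(part)
        if not isinstance(node, dict) or part not in node:
            return False, ".".join(walked)
        node = node[part]
    return True, node


# Assumed shape: a config whose model section has no image_size,
# while the dataset processor section does.
stage_config = {
    "model": {"arch": "minigpt5", "freeze_vit": True},
    "datasets": {
        "cc_sbu_align": {"vis_processor": {"train": {"image_size": 224}}}
    },
}

print(get_by_path(stage_config, "model.image_size"))
print(get_by_path(stage_config, "datasets.cc_sbu_align.vis_processor.train.image_size"))
```

A lookup that fails at `model.image_size` while the dataset path succeeds would suggest the model section was built from a file that never defined the key.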

KzZheng commented 11 months ago

Then, can you check self.args.cfg_path at line 27 of minigpt4/common/config.py and model_config_path at line 71 of minigpt4/common/config.py?

lckj2009 commented 11 months ago

  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in __init__
    num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

Because this error is raised first, the program stops before it reaches "minigpt4/common/config.py".

Here are the modifications I made, but the program has not reached them yet.

[image: 5]

lckj2009 commented 11 months ago

I am currently setting up the environment and will use PyCharm to debug once it is ready. Then I will take a screenshot to show you the situation.

KzZheng commented 11 months ago

This is weird. According to your error:

Traceback (most recent call last):
  File "/root/MiniGPT-5/examples/playground.py", line 40, in <module>
    minigpt5 = MiniGPT5_Model.load_from_checkpoint(stage1_ckpt, strict=False, map_location="cpu", encoder_model_config=model_args, **vars(training_args))
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/module.py", line 1552, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 89, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 156, in _load_state
    obj = cls(**_cls_kwargs)
  File "/root/MiniGPT-5/model.py", line 68, in __init__
    self.model = MiniGPT5.from_config(minigpt4_config.model_cfg)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 247, in from_config
    model = cls(
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt5.py", line 46, in __init__
    super().__init__(*args, **kwargs)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 53, in __init__
    self.visual_encoder, self.ln_vision = self.init_vision_encoder(
  File "/root/MiniGPT-5/minigpt4/models/blip2.py", line 65, in init_vision_encoder
    visual_encoder = create_eva_vit_g(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 416, in create_eva_vit_g
    model = VisionTransformer(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 259, in __init__
    self.patch_embed = PatchEmbed(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in __init__
    num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

Your error starts from File "/root/MiniGPT-5/model.py", line 68, but we are testing line 67. The error should be after config. Also, you already tried to print minigpt4_config.model_cfg.image_size before and you did not receive any error in line 67.

lckj2009 commented 11 months ago

Yes. The error starts at the next call, `self.model = MiniGPT5.from_config(minigpt4_config.model_cfg)`. Before this, no errors were reported.

About the error in `(img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])`: I found earlier that both `img_size[1]` and `img_size[0]` are None.

lckj2009 commented 11 months ago

I found that the values of `img_size[1]` and `img_size[0]` are None, and I think this is the reason for the error.

lckj2009 commented 11 months ago

Because img_size[1]=None and img_size[0]=None, an error of "TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'" occurred

KzZheng commented 11 months ago

As I said in my earlier comment, img_size should not be None. Everything I asked you to print above is meant to find out why it is None here.

lckj2009 commented 11 months ago

Now I can debug.

img_size is in minigpt4_config.datasets_cfg.cc_sbu_align.vis_processor.train.img_size, not in minigpt4_config.model_cfg.

There, img_size == 224, which is correct, and no error occurs.

[image: 6]

[image: 7]

KzZheng commented 11 months ago

But it should be inside both model_cfg and dataset_cfg.

[image]

lckj2009 commented 11 months ago

Is there a problem with my 'minigpt4.yaml' file, or did I read the wrong file? Could you please take a look at the path and content of the 'minigpt4.yaml' file in my reply above?

KzZheng commented 11 months ago

I don't see anything wrong in your config file. To check whether you are reading the correct file, you should first check model_config_path at line 71 of minigpt4/common/config.py.

lckj2009 commented 11 months ago

The error occurs before execution reaches 'self.tokenizer = self.model.llama_tokenizer'. The error comes from 'eva_vit.py'; please analyze it and take a look at the breakpoint debugging results.

[image: 8] [image: 9]

lckj2009 commented 11 months ago

I set a breakpoint here, but execution never reached it.

[image: 10]

KzZheng commented 11 months ago

I think that may be the reason. You have multiple minigpt4 folders on your Python path, so Python loads minigpt4 from the wrong folder. You can step into line 67 to see where Config(MiniGPT4Args) leads.
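One quick way to test this hypothesis without a debugger is to ask Python where it would import a package from. The sketch below relies only on the standard library. `locate_package` is a hypothetical helper name, and `minigpt4` is used as the package name on the assumption that it is importable from the directory you run it in.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file or folders Python would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Regular packages report the path of their __init__.py as the origin.
    if spec.origin and spec.origin != "namespace":
        return spec.origin
    # Namespace packages have no single file, only search locations.
    return ", ".join(spec.submodule_search_locations or [])


# Run this from the same working directory and environment as playground.py.
print("minigpt4 resolves to:", locate_package("minigpt4"))

# Earlier entries win, so a second checkout listed first shadows the intended one.
for index, entry in enumerate(sys.path):
    print(index, entry)
```

If the printed location is not inside the checkout you are editing, breakpoints and config changes in that checkout will appear to be ignored.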

lckj2009 commented 11 months ago

After restarting:

Line 27 of /root/minigpt555/minigpt4/common/config.py: minigpt4.yaml is OK.

[image: 11]

Line 71 of /root/minigpt555/minigpt4/common/config.py: minigpt4.yaml is OK.

[image: 12]

Line 67 of /root/minigpt555/model.py: minigpt4.yaml is OK.

[image: 13]

They all point to the path /root/MiniGPT-5/config/minigpt4.yaml.

/root/minigpt555 and /root/MiniGPT-5 contain the same files.

KzZheng commented 11 months ago

model_config_path at line 71 of /root/minigpt555/minigpt4/common/config.py should be /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml instead of /root/MiniGPT-5/config/minigpt4.yaml.

Please check the function default_config_path at line 79 of minigpt4/models/base_model.py to see why you get the wrong path.

lckj2009 commented 11 months ago

**Thank you, the problem has been resolved.**

**But a new error has occurred:**

Traceback (most recent call last):
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f2252a4d8b0>: Failed to establish a new connection: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /stabilityai/stable-diffusion-2-1-base/resolve/main/text_encoder/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f2252a4d8b0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
    r = _request_wrapper(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
    response = _request_wrapper(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 425, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 63, in send
    return super().send(request, *args, **kwargs)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /stabilityai/stable-diffusion-2-1-base/resolve/main/text_encoder/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f2252a4d8b0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 2236cb4e-1384-4f70-882e-68340d88ead0)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/transformers/utils/hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1377, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/transformers/configuration_utils.py", line 672, in _get_config_dict
    resolved_config_file = cached_file(
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/transformers/utils/hub.py", line 452, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like stabilityai/stable-diffusion-2-1-base is not the path to a directory containing a file named text_encoder/config.json. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
python-BaseException

**I wonder whether I need to download 'https://huggingface.co/julien-c/EsperBERTo-small/resolve/main/pytorch_model.bin'. The network environment here is not good, so I want to download 'pytorch_model.bin' first and put it in a local folder. But I don't know which folder I should put it in; please let me know, thank you.**

**If it's not this file, please tell me the other file names so that I can download them and place them locally.**

lckj2009 commented 11 months ago

I have now cloned 'stabilityai/stable-diffusion-2-1-base'. Should I also put it in the /root/MiniGPT-5 directory? Where exactly should it be placed?

KzZheng commented 11 months ago

You can place it anywhere you want. Just change sd_model_name at line 73 of model.py to your local path.
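As a rough illustration of that suggestion, the sketch below picks a local clone when it looks complete and otherwise falls back to the hub id. `pick_sd_source` is a hypothetical helper, not something in the repository. The only file it checks for is text_encoder/config.json, because that is the file named in the error above; a real clone contains more than this.

```python
import tempfile
from pathlib import Path

HUB_ID = "stabilityai/stable-diffusion-2-1-base"


def pick_sd_source(local_dir, hub_id=HUB_ID):
    """Return the local directory if it contains the file the loader asked for."""
    marker = Path(local_dir) / "text_encoder" / "config.json"
    if marker.is_file():
        return str(Path(local_dir))
    # No usable local copy: fall back to the hub id, which needs network access.
    return hub_id


# Demonstration with a throwaway directory standing in for the cloned model.
with tempfile.TemporaryDirectory() as clone_dir:
    print(pick_sd_source(clone_dir))  # falls back to the hub id
    (Path(clone_dir) / "text_encoder").mkdir()
    (Path(clone_dir) / "text_encoder" / "config.json").write_text("{}")
    print(pick_sd_source(clone_dir))  # now resolves to the local directory
```

The returned string is what you would assign to sd_model_name. On a machine without network access, the offline-mode page linked in the error message describes how to keep the libraries from contacting the hub.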

lckj2009 commented 11 months ago

Now the models are all in place. The result is a new error: CUDA out of memory.

  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/minigpt555/minigpt4/models/modeling_llama.py", line 140, in forward
    return self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 21.99 GiB total capacity; 21.38 GiB already allocated; 17.00 MiB free; 21.66 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
python-BaseException

As I remember, the 'python3 playground.py --stage1_weight WEIGHT_FOLDER/stage1_cc3m.ckpt' command should not take up much memory. My server has 2 graphics cards, each with 24 GB of graphics memory.

[image: 1]

lckj2009 commented 11 months ago

Regarding the error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 21.99 GiB total capacity; 21.42 GiB already allocated; 107.00 MiB free; 21.57 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I tried reducing the maximum allocator split size to 32 MB to limit memory fragmentation, but it failed with the same error. I think we need you to suggest another way to solve it. This is the setting I used:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:32
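The numbers in the error message give a hint about why the split-size setting made no difference. The sketch below parses the message quoted above and compares reserved memory with allocated memory. `summarize_oom` and the regular expression are written for this example and assume the message layout shown in this thread; other PyTorch versions may word it differently.

```python
import re

OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<request_mib>[\d.]+) MiB "
    r"\(GPU \d+; (?P<total_gib>[\d.]+) GiB total capacity; "
    r"(?P<allocated_gib>[\d.]+) GiB already allocated; "
    r"(?P<free_mib>[\d.]+) MiB free; "
    r"(?P<reserved_gib>[\d.]+) GiB reserved in total by PyTorch\)"
)


def summarize_oom(message):
    """Extract the memory figures from an out-of-memory message as floats."""
    match = OOM_PATTERN.search(message)
    if match is None:
        raise ValueError("message does not match the expected OOM layout")
    fields = {name: float(value) for name, value in match.groupdict().items()}
    # Memory held by the caching allocator but not used by live tensors.
    fields["cached_unused_gib"] = round(
        fields["reserved_gib"] - fields["allocated_gib"], 2
    )
    # Share of the whole card occupied by live tensors.
    fields["allocated_fraction"] = round(
        fields["allocated_gib"] / fields["total_gib"], 3
    )
    return fields


message = (
    "CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 21.99 GiB total "
    "capacity; 21.42 GiB already allocated; 107.00 MiB free; 21.57 GiB "
    "reserved in total by PyTorch)"
)
summary = summarize_oom(message)
print(summary["cached_unused_gib"], summary["allocated_fraction"])
```

Here reserved and allocated memory differ by only about 0.15 GiB, while live tensors already fill roughly 97% of the card. That pattern points to the model simply not fitting on one GPU rather than to fragmentation, so reducing what is loaded onto GPU 0 is more likely to help than allocator tuning.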