final text_encoder_type: bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Model loaded from /home/yoke/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/6fb3434d67548d71747b1ab3a32051d27a30c71f/groundingdino_swint_ogc.pth
=> _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/transformers/modeling_utils.py:830: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
text_encoder/model.safetensors not found
Fetching 16 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 5680.93it/s]
/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
warnings.warn(
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
0%| | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/data/yoke/Grounded-Segment-Anything/gradio_app.py", line 254, in run_grounded_sam
image = pipe(prompt=inpaint_prompt, image=image_pil, mask_image=mask_pil).images[0]
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data/yoke/anaconda3/envs/py310/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 854, in __call__
latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 64 but got size 109 for tensor number 2 in the list.
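This error comes from the inpainting pipeline concatenating the noise latents, the mask, and the masked-image latents along the channel dimension: their spatial sizes must match. The VAE downsamples by a factor of 8, so a latent width of 64 corresponds to a 512 px input while 109 corresponds to an 872 px input — i.e. `image_pil` and `mask_pil` were passed at different resolutions. A minimal sketch of the usual fix, resizing both to the same size before calling `pipe` (the helper name `prepare_inpaint_inputs` and the 512×512 target are illustrative assumptions, not part of the original code):

```python
from PIL import Image

def prepare_inpaint_inputs(image_pil, mask_pil, size=(512, 512)):
    # Resize both inputs to the same resolution so that the VAE latents
    # (H/8 x W/8) line up when the pipeline concatenates
    # [latent_model_input, mask, masked_image_latents] along dim=1.
    image = image_pil.convert("RGB").resize(size, Image.LANCZOS)
    # Use nearest-neighbor for the mask to keep it binary (no gray edges).
    mask = mask_pil.convert("L").resize(size, Image.NEAREST)
    return image, mask
```

With this in place, the call in `gradio_app.py` would become something like `image, mask = prepare_inpaint_inputs(image_pil, mask_pil)` followed by `pipe(prompt=inpaint_prompt, image=image, mask_image=mask)`. Alternatively, resizing the mask to `image_pil.size` works as long as both dimensions are multiples of 8.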