SunTongtongtong opened 1 year ago
======>Auto Resize Image...
Resize image form 493x512 to 512x512
/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/transformers/generation/utils.py:1313: UserWarning: Using `max_length`'s default (20) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
Processed ImageCaptioning, Input Image: image/aac0c16e.png, Output Text: a puppy sitting in a truck with hay
Processed run_image, Input image: image/aac0c16e.png Current state: [('image/aac0c16e.png', 'Received. ')] Current Memory: Human: provide a figure named image/aac0c16e.png. The description is: a puppy sitting in a truck with hay. This information helps you to understand this image, but you should use tools to finish following tasks, rather than directly imagine from my description. If you understand, say "Received". AI: Received. history_memory: Human: provide a figure named image/aac0c16e.png. The description is: a puppy sitting in a truck with hay. This information helps you to understand this image, but you should use tools to finish following tasks, rather than directly imagine from my description. If you understand, say "Received". AI: Received. , n_tokens: 48
Entering new AgentExecutor chain...
Action: Replace Something From The Photo
Action Input: image/aac0c16e.png, puppy, small dog
image_path=image/aac0c16e.png, to_be_replaced_txt= puppy
/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/transformers/modeling_utils.py:862: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Traceback (most recent call last):
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/gradio/routes.py", line 412, in run_predict
output = await app.get_blocks().process_api(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/gradio/blocks.py", line 1299, in process_api
result = await self.call_function(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/gradio/blocks.py", line 1021, in call_function
prediction = await anyio.to_thread.run_sync(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "visual_chatgpt.py", line 1307, in run_text
res = self.agent({"input": text.strip()})
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/chains/base.py", line 168, in __call__
raise e
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/chains/base.py", line 165, in __call__
outputs = self._call(inputs)
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/agents/agent.py", line 503, in _call
next_step_output = self._take_next_step(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/agents/agent.py", line 420, in _take_next_step
observation = tool.run(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/tools/base.py", line 71, in run
raise e
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/tools/base.py", line 68, in run
observation = self._run(tool_input)
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/langchain/agents/tools.py", line 17, in _run
return self.func(tool_input)
File "visual_chatgpt.py", line 1233, in inference_replace_sam
masks = self.sam.get_mask_with_boxes(image_pil, image, boxes_filt)
File "visual_chatgpt.py", line 841, in get_mask_with_boxes
masks, _, _ = self.sam_predictor.predict_torch(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/segment_anything/predictor.py", line 229, in predict_torch
low_res_masks, iou_predictions = self.model.mask_decoder(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/segment_anything/modeling/mask_decoder.py", line 94, in forward
masks, iou_pred = self.predict_masks(
File "/import/sgg-homes/ss014/software/anaconda3/envs/visgpt_2/lib/python3.8/site-packages/segment_anything/modeling/mask_decoder.py", line 144, in predict_masks
masks = (hyper_in @ upscaled_embedding.view(b, c, h * w)).view(b, -1, h, w)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 256, 256] because the unspecified dimension size -1 can be any value and is ambiguous
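For what it's worth, the error message says the mask decoder received an empty batch: GroundingDINO detected zero boxes, so `b == 0` and the inferred `-1` dimension in the final `.view` is ambiguous. A minimal reproduction of the reshape failure (using NumPy here instead of torch, purely for illustration):

```python
import numpy as np

# An empty batch (b = 0): reshaping 0 elements with an inferred -1
# dimension is ambiguous, because -1 could then be any value.
empty = np.empty((0, 4, 256 * 256))  # stands in for hyper_in @ upscaled
try:
    empty.reshape(0, -1, 256, 256)
except ValueError as e:
    print(e)  # cannot reshape array of size 0 ...
```

The same reshape succeeds for any non-empty batch, which is why the crash only shows up when the detector finds nothing.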
I found out the reason: the GroundingDINO object detector cannot work when Text2Box is loaded on a CUDA card other than 0. When I changed it to load on cuda:0, it works. I don't understand why this happens, though.
I met the same issue. When I load Text2Box on cuda:2, a process is also created on cuda:0. I don't know if part of Text2Box has to run on cuda:0.
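Since the crash only happens when the detector comes back empty, a defensive check before handing the boxes to SAM would at least give a clearer error. A sketch (the function name and message are hypothetical, not part of the repo):

```python
def check_boxes(boxes_filt):
    # Hypothetical guard: fail loudly when GroundingDINO found nothing,
    # instead of letting SAM's mask decoder crash on an empty tensor.
    if len(boxes_filt) == 0:
        raise ValueError(
            "GroundingDINO returned no boxes; check that Text2Box and "
            "Segmenting are loaded on the same CUDA device.")
    return boxes_filt

check_boxes([[0, 0, 10, 10]])  # non-empty: passes through unchanged
```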
Please tell me how to deal with this problem.
Hello there,
I am trying to generate an image from my uploaded image and modify it with a text input. I load the models as #373 suggests:
python visual_chatgpt.py --load "Text2Box_cuda:1,Segmenting_cuda:1,Inpainting_cuda:0,ImageCaptioning_cuda:0"
However, I receive the error in the title. When I load all the models with the defaults from the README, the server runs but generates completely dissimilar images.
Can anyone help?
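One workaround worth trying (a sketch, assuming the stray cuda:0 allocation comes from a hard-coded `.cuda()` call somewhere inside the Text2Box/GroundingDINO stack) is to remap devices with `CUDA_VISIBLE_DEVICES`, so the card you actually want is device 0 inside the process:

```python
import os

# Hypothetical launcher: expose only physical GPU 1, so every hard-coded
# cuda:0 reference inside the tools lands on that card.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="1")
cmd = ["python", "visual_chatgpt.py", "--load",
       "Text2Box_cuda:0,Segmenting_cuda:0,Inpainting_cuda:0,ImageCaptioning_cuda:0"]
print(" ".join(cmd))
# import subprocess; subprocess.run(cmd, env=env)  # uncomment to launch
```

With all tools addressed as cuda:0 under the remapping, there is no second device for a stray allocation to land on.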