yun189 opened 4 days ago
Can I run other experiments, such as cross-modal or omni-cross abilities?
Running inference_demo.py prints ['a man riding skis down a snow covered slope. a man is speaking with background noise and breathing sounds.'], which describes example/test.jpg. Is that right?
```
load_from_pretrained: ./MiCo-g/ckpt/model_step_319989.pt
Please 'pip install xformers'
Please 'pip install xformers'
Please 'pip install xformers'
WARNING:model.bert:If you want to use BertForMaskedLM make sure config.is_decoder=False for bi-directional self-attention.
Unexpected keys [] missing_keys []
/home/liran/miniforge3/envs/MiCo_py39/lib/python3.9/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).
  warnings.warn(
tensor([[0.1206],
        [0.0043]], device='cuda:0', grad_fn=)
tensor([0.7154, 0.0451], device='cuda:0', grad_fn=)
/home/liran/miniforge3/envs/MiCo_py39/lib/python3.9/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
['a man riding skis down a snow covered slope. a man is speaking with background noise and breathing sounds.']
```
Yes, you can simply refer to the image, audio, and video. The generated caption is precise.
Can I run other experiments, such as cross-modal or omni-cross abilities?
Exactly, you can use it for any experiments you want, since it has been well-pretrained.
When testing other tasks, I ran into many problems. Will you make the details public?
How can I know which tasks you want to use the pretrained models for? I don't know what problems you ran into.
For example, depth estimation: I don't know how to process the input data.
When I test the depth task with image_input of shape [1,1,3,224,224], output = model.forward_depth_encoder(image_input) returns a tensor of shape [257,1408]. What does that mean? How can I get a depth image from it? I don't know right now.
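For what it's worth, the [257, 1408] shape looks like ViT-style token features: 1 [CLS] token plus a 16x16 grid of 256 patch tokens (224/14 = 16), each 1408-dimensional. Turning that into a depth image needs a decoder head, which the demo does not appear to include. Below is a minimal toy sketch of such a head; ToyDepthHead is hypothetical, not part of the MiCo codebase, and a real depth decoder would be trained and more elaborate:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyDepthHead(nn.Module):
    """Hypothetical head: one depth value per patch token, then upsample."""
    def __init__(self, dim=1408, grid=16, out_size=224):
        super().__init__()
        self.grid = grid
        self.out_size = out_size
        self.proj = nn.Linear(dim, 1)  # project each token to a scalar depth

    def forward(self, tokens):                     # tokens: [257, 1408]
        patches = tokens[1:]                       # drop [CLS] -> [256, 1408]
        depth = self.proj(patches)                 # [256, 1]
        depth = depth.view(1, 1, self.grid, self.grid)  # patch grid [1,1,16,16]
        # bilinearly upsample the coarse grid to image resolution
        return F.interpolate(depth, size=(self.out_size, self.out_size),
                             mode="bilinear", align_corners=False)

head = ToyDepthHead()
tokens = torch.randn(257, 1408)  # stand-in for forward_depth_encoder output
depth_map = head(tokens)         # [1, 1, 224, 224] pseudo depth map
```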